Data Engineer (W2 only)

Logix Guru

Pittsburgh · Hybrid · Contract · Senior · 2w ago

About the role

Role

  • Data Engineer

Duration

  • Contractor, 6–12 months (with possible extension)

Location

  • Pittsburgh, PA - Hybrid

Overview

We are seeking an experienced Data Engineer contractor to support our steel manufacturing operations. This individual will design, build, and optimize data pipelines and infrastructure, enabling advanced analytics, process automation, and data‑driven decision‑making.
The Data Engineer will work closely with data science, process engineering, and IT teams to ensure data reliability and actionable insights across the manufacturing lifecycle.

Key Responsibilities

  • Develop and maintain scalable, reliable data pipelines for industrial data (e.g., real-time streaming, time series, IoT, sensor, MES, and ERP system data); a minimal pipeline sketch follows this list.
  • Integrate data from different sources (databases, cloud, on-premises) and engineer workflows for efficient ETL/ELT processing and data validation.
  • Collaborate with architects, data engineers, data scientists, analysts, and business stakeholders to define and deliver solutions.
  • Collaborate with IT admins, network/security engineers, and cross-functional teams to support stable production operations and troubleshoot infrastructure issues (including managing and integrating IaC, PaaS, and SaaS solutions).
  • Manage the backlog, support QA/testing, and communicate requirements with business stakeholders in the steel manufacturing domain.
  • Mentor team members: provide guidance, facilitate skill growth, offer technical coaching, and encourage best practices across teams via code reviews.
  • Build and maintain data infrastructure in compliance with data governance and security best practices.
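
For illustration only, here is a minimal PySpark sketch of the kind of batch cleanup step such a pipeline might include; the paths, columns, and schema are hypothetical assumptions, not details from this posting.

```python
# Minimal sketch: clean raw sensor readings and write a curated, partitioned table.
# All paths and column names (sensor_id, ts, value) are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("sensor-etl-sketch").getOrCreate()

# Read raw IoT/sensor data (assumed columns: sensor_id, ts, value)
raw = spark.read.parquet("/data/raw/sensors/")

curated = (
    raw
    .withColumn("ts", F.to_timestamp("ts"))      # normalize timestamps
    .filter(F.col("value").isNotNull())          # drop empty readings
    .dropDuplicates(["sensor_id", "ts"])         # de-duplicate replayed messages
    .withColumn("date", F.to_date("ts"))         # daily partition key
)

# Partitioning by day keeps downstream time-series queries cheap.
curated.write.mode("overwrite").partitionBy("date").parquet("/data/curated/sensors/")
```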

Requirements

  • Bachelor’s degree in computer science or a related field and 5+ years of experience as a Data Engineer.
  • Strong experience building, maintaining, and optimizing ETL/ELT data pipelines with Python, Pandas, and PySpark, and orchestrating workflows with tools such as Apache Airflow and the Kedro framework (a minimal DAG sketch follows this list).
  • Advanced SQL/KQL query development and optimization across Oracle, MSSQL, and MySQL databases (hosted on-premises or via PaaS offerings).
  • Experience developing and consuming Flask-based and FastAPI RESTful APIs for data services and integration.
  • Proficiency in Linux shell scripting for automation and data workflow management.
  • Experience with DevOps practices, including CI/CD for data pipelines and use of tools such as Git, Docker, and IaC frameworks for provisioning and deployment.
  • Hands-on experience deploying solutions across multiple clouds (OCI, Azure, GCP), including setting up cross-cloud data integration and transfer.
  • Experience with cloud platforms (OCI, Azure, GCP) and big data tools (Spark, Hadoop, Kafka, Databricks).
  • Understanding of data modeling, data profiling, data quality, data lake/warehouse architectures, and data ingestion from operational technologies.
  • Familiarity with industrial protocols, time‑series databases (like OSIsoft PI), and manufacturing data (MES, PLC).
  • Strong troubleshooting, process automation, and root‑cause analysis skills.
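
As a hedged sketch of the orchestration experience described above, the following Airflow DAG wires a single extract → validate → load chain; the DAG id, schedule, and task bodies are illustrative assumptions only.

```python
# Minimal Airflow 2.x sketch: one extract -> validate -> load chain.
# The DAG id, schedule, and task logic are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**_):
    print("pull raw rows from the source system")

def validate(**_):
    print("run row-count and null checks on the extract")

def load(**_):
    print("write curated rows to the warehouse")

with DAG(
    dag_id="plant_sensor_etl",       # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",              # Airflow 2.4+ argument; older versions use schedule_interval
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_validate = PythonOperator(task_id="validate", python_callable=validate)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_validate >> t_load
```

In practice each callable would hand data off through a table or object store rather than returning it directly, keeping tasks idempotent and retry-safe.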

Preferred Skills

  • Data Ingestion Pipeline: Python, PySpark, Airflow, Kedro, Linux shell scripting
  • API Development: Flask, FastAPI, RESTful design (an endpoint sketch follows this list)
  • Data Storage & Querying: SQL (Oracle, MSSQL, MySQL), KQL (Azure Data Explorer), big data (Hadoop, Oracle BDS), OSIsoft PI
  • Cloud Integration: Multi-cloud platforms (OCI, Azure, GCP), data sharing across clouds (Databricks)
  • Real-Time Data Streaming: Kafka, Azure Event Hub, EMQX
  • Reporting Tools: Tableau, OAC, Power BI
  • Collaboration: Wiki, Azure DevOps Boards, MS Office 365
  • Data Governance & Quality: Data profiling/validation tools (Pandas Profiling), data quality monitoring (e.g., Great Expectations), lineage tracking (cloud data catalogs)
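
To illustrate the API Development row, here is a minimal FastAPI sketch of a read-only data service; the route, response model, and in-memory store are hypothetical stand-ins for a real SQL/KQL-backed query layer.

```python
# Minimal FastAPI sketch: a read-only endpoint serving per-sensor summaries.
# The route, model fields, and FAKE_STORE are hypothetical placeholders.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="sensor-data-service-sketch")

class SensorSummary(BaseModel):
    sensor_id: str
    avg_value: float
    sample_count: int

# Stand-in for a real database-backed query layer
FAKE_STORE = {
    "furnace-01": SensorSummary(sensor_id="furnace-01", avg_value=1520.4, sample_count=360),
}

@app.get("/sensors/{sensor_id}/summary", response_model=SensorSummary)
def get_summary(sensor_id: str) -> SensorSummary:
    summary = FAKE_STORE.get(sensor_id)
    if summary is None:
        raise HTTPException(status_code=404, detail="unknown sensor")
    return summary
```

Run locally with `uvicorn app:app --reload` (assuming the file is named app.py).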

Skills

Apache Airflow, Azure, Azure DevOps Boards, Databricks, Docker, EMQX, FastAPI, Flask, GCP, Git, Hadoop, IaC, Kafka, Kedro, KQL, Linux shell scripting, MS Office 365, MSSQL, MySQL, OCI, Oracle, OSIsoft PI, Pandas, Power BI, Python, PySpark, RESTful APIs, Spark, SQL, Tableau, Wiki
