
Data Engineer Intermediate

Jobs via Dice

New York · On-site · Full-time · Mid Level · Posted yesterday

About the role

Job Summary

We are seeking a skilled Data Engineer to design, build, and manage scalable ETL pipelines supporting a centralized data lake and Snowflake data warehouse. The role focuses on automating data ingestion, transformation, and aggregation workflows to enable reliable analytics and data-driven decision-making.

Key Responsibilities

  • Design, develop, and maintain robust ETL pipelines for ingesting data into the enterprise data lake and Snowflake environment.
  • Automate data processing, aggregation, and analytical workflows to improve data availability and performance.
  • Implement and manage orchestration and scheduling of data pipelines using Control-M and Apache Airflow.
  • Develop scalable data transformation logic using PySpark and Apache Spark (Java).
  • Work with large, structured and semi-structured datasets on AWS infrastructure.
  • Ensure data quality, integrity, and reliability across data pipelines.
  • Optimize data pipelines for performance, cost, and scalability.
  • Collaborate with analytics, data science, and business teams to understand data requirements.
  • Monitor, troubleshoot, and resolve pipeline failures and performance bottlenecks.
  • Follow best practices for data engineering, security, and documentation.

Required Skills & Qualifications

  • Strong experience with data lake architectures and large-scale data processing.
  • Hands-on experience with AWS services (e.g., S3, EC2, EMR, Glue, or related).
  • Proven expertise in building ETL pipelines for analytics and reporting use cases.
  • Solid working knowledge of Snowflake, including data loading, transformations, and performance optimization.
  • Experience with workflow automation and scheduling tools such as Control-M and Apache Airflow.
  • Proficiency in PySpark for distributed data processing.
  • Strong programming experience with Apache Spark using Java.
  • Good understanding of data modeling, partitioning, and performance tuning concepts.

Skills

AWS · AWS Glue · Apache Airflow · Apache Spark · Control-M · EC2 · EMR · ETL · Java · PySpark · S3 · Snowflake
