
Senior Data Engineer with PySpark and Microservices

Jobs via Dice

Sinking Spring, PA · Hybrid · Full-time · Senior · Posted today

About the role

Dice is the leading career destination for tech experts at every stage of their careers. Our client, E‑Solutions, Inc., is seeking the following. Apply via Dice today!

Role

Senior Data Engineer with PySpark and Microservices

Location

Reading, PA (Hybrid)

Employment Type

Contract-to-hire (C2H) or full-time

Job Summary

We are seeking a highly skilled Senior Data Engineer to design, build, and operate scalable, high‑performance data platforms. This is a hands‑on engineering role requiring deep expertise in PySpark, Python Microservices, and Python programming, along with modern data lake technologies such as Apache Iceberg. The ideal candidate will work closely with data architects and platform leads to implement reliable batch and streaming data pipelines on AWS that support analytics and business‑critical applications.

Key Responsibilities

  • Hands‑On Data Engineering
    • Design, develop, and maintain large‑scale batch and streaming data pipelines using PySpark and Kafka.
    • Write production‑grade Python code for complex data transformations, validations, and business logic.
    • Implement efficient processing of high‑volume, high‑velocity data across distributed systems.
  • Streaming & Real‑Time Processing
    • Build and operate real‑time and near real‑time data pipelines using Kafka (see the streaming sketch after this list).
    • Implement stateful processing, windowing, checkpointing, and fault‑tolerant streaming applications.
    • Ensure low‑latency and high‑throughput streaming solutions.
  • Data Lake & Iceberg
    • Design and manage data lake architectures using Apache Iceberg on cloud storage (S3).
    • Implement Iceberg capabilities such as schema evolution, partitioning, compaction, and time travel (illustrated in the Spark SQL sketch after this list).
    • Optimize read and write performance for large Iceberg tables.
  • Cloud & Data Integration
    • Design, develop, and deploy data pipelines on AWS using services such as S3, EMR, Glue, Lambda, Athena, Redshift, and Aurora (MySQL/PostgreSQL) for data ingestion, processing, and analytics.
  • Performance, Reliability & Operations
    • Tune Spark jobs for performance, scalability, and cost efficiency (see the configuration sketch after this list).
    • Troubleshoot and resolve complex production issues in distributed data systems.
    • Implement monitoring, alerting, logging, and recovery strategies for data pipelines.
  • Engineering Excellence & Collaboration
    • Write clean, testable, and maintainable code following engineering best practices.
    • Contribute to CI/CD pipelines for data engineering workloads.
    • Participate in code reviews, technical design discussions, and architecture reviews.
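
For context, here is a minimal sketch of the kind of streaming pipeline the bullets above describe, assuming PySpark Structured Streaming reading from Kafka (the posting's stack). The broker address, topic name, and checkpoint path are illustrative placeholders, and the Kafka source requires the spark-sql-kafka package on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Read the Kafka topic as an unbounded streaming DataFrame.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    .selectExpr("CAST(value AS STRING) AS raw", "timestamp")
)

# Stateful windowed aggregation: count events per 5-minute window; the
# watermark bounds how long state is retained for late-arriving data.
counts = (
    events.withWatermark("timestamp", "10 minutes")
    .groupBy(F.window("timestamp", "5 minutes"))
    .count()
)

# The checkpoint location is what makes the query fault-tolerant: on restart,
# Spark resumes from the last committed offsets and state.
query = (
    counts.writeStream.outputMode("update")
    .format("console")
    .option("checkpointLocation", "s3://my-bucket/checkpoints/events")  # placeholder
    .start()
)
query.awaitTermination()
```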
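
The Iceberg bullets likewise map to concrete Spark SQL statements and procedures. A hedged sketch follows, assuming an Iceberg catalog named lake and a table db.orders (both hypothetical), with the Iceberg Spark runtime and SQL extensions configured; the time-travel syntax requires Spark 3.3+.

```python
# Schema evolution: adding a column is a metadata-only change.
spark.sql("ALTER TABLE lake.db.orders ADD COLUMNS (discount decimal(10,2))")

# Partition evolution: future writes use the new spec; old files are untouched.
spark.sql("ALTER TABLE lake.db.orders ADD PARTITION FIELD days(order_ts)")

# Compaction: Iceberg's rewrite_data_files procedure merges small data files.
spark.sql("CALL lake.system.rewrite_data_files(table => 'db.orders')")

# Time travel: query the table as of a snapshot ID or a timestamp.
spark.sql("SELECT * FROM lake.db.orders VERSION AS OF 1234567890123456789")
spark.sql("SELECT * FROM lake.db.orders TIMESTAMP AS OF '2024-01-01 00:00:00'")
```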
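
For the tuning bullet, these are the standard Spark settings usually examined first; the values below are illustrative starting points, not recommendations from the posting.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("tuned-batch-job")
    # Adaptive Query Execution re-optimizes shuffles and joins at runtime.
    .config("spark.sql.adaptive.enabled", "true")
    # Shuffle parallelism sized to the data volume (illustrative value).
    .config("spark.sql.shuffle.partitions", "400")
    # Kryo serialization is typically faster and more compact than Java's default.
    .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    # Dynamic allocation releases idle executors, which helps control cost.
    .config("spark.dynamicAllocation.enabled", "true")
    .getOrCreate()
)
```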

Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • Strong expertise in PySpark, Microservices, and Spark SQL for large‑scale data processing (a minimal microservice sketch follows this list).
  • Strong expertise in Kafka, including streaming fundamentals and stateful processing.
  • Hands‑on experience building and running Kafka‑based streaming applications in production environments.
  • Advanced proficiency in Python for building scalable, production‑grade data solutions.
  • Hands‑on experience with Apache Iceberg in production environments.
  • Solid experience with AWS data services (S3, EMR, Glue, Lambda, Redshift).
  • Advanced SQL skills and strong understanding of data modeling and data lake architectures.
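
For the microservices requirement, a minimal sketch of a Python microservice; the posting names no framework, so FastAPI and the endpoint shape here are assumptions.

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Event(BaseModel):
    user_id: str
    amount: float

@app.post("/events")
def ingest_event(event: Event) -> dict:
    # A production service would validate the payload and publish it to
    # Kafka or the data lake; this stub only acknowledges receipt.
    return {"status": "accepted", "user_id": event.user_id}

# Run with: uvicorn main:app --reload  (module name "main" is hypothetical)
```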

Contact

Prisca
Account Manager, E‑Solutions

Disclaimer

E‑Solutions Inc. provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, or status as a covered veteran in accordance with applicable federal, state and local laws. We especially invite women, minorities, veterans, and individuals with disabilities to apply. EEO/AA/M/F/Vet/Disability.

Additional Tags

Skills: Microservices, Python, PySpark, Data Engineer
Engagement types: C2C, W-2, 1099, C2H, Part Time, Full Time, Other, Intern, Pass Through, Contract, Hourly, Contract to Perm, W2 - Profit Sharing, Sub-con, Permanent
Location: United States

Skills

AWS, AWS Glue, AWS Lambda, AWS Redshift, AWS S3, Apache Iceberg, Aurora, EMR, Kafka, Microservices, MySQL, PostgreSQL, Python, PySpark, Spark SQL
