Data Engineer
Swartek Corporation
US · Hybrid · Full-time · Senior · 1mo ago
About the role
We are looking for a Data Engineer to support the modernization of large-scale data processing platforms on AWS. The role focuses on optimizing Spark-based workloads and supporting their migration from EMR on EC2 to EMR on EKS, with the aim of improving performance, scalability, and cost efficiency.
You will work closely with data, platform, and DevOps teams to improve the reliability and efficiency of data pipelines.
Key Responsibilities
- Design, build, and optimize Spark-based data pipelines (batch and/or streaming)
- Work with AWS services such as EMR, S3, IAM, and CloudWatch
- Support migration of workloads from EMR (EC2) to EMR on EKS
- Optimize Spark jobs for performance (memory, partitioning, resource utilization)
- Troubleshoot production issues and improve pipeline reliability
- Collaborate with DevOps teams on CI/CD and infrastructure automation
- Contribute to monitoring, logging, and operational best practices
Required Qualifications
- 7+ years of experience in Data Engineering or related field
- Strong experience with Apache Spark (PySpark/Scala) and SQL
- Hands-on experience with AWS (EMR, S3, IAM, CloudWatch)
- Working knowledge of Kubernetes (EKS or similar)
- Experience building and maintaining data pipelines in production environments
- Hands-on experience optimizing Spark jobs in production (performance tuning, resource optimization)
- Exposure to Kubernetes-based execution environments (EKS or similar)
- Strong problem-solving and debugging skills
Preferred Qualifications
- Experience with EMR on EKS or Spark on Kubernetes
- Familiarity with Hive or Spark SQL
- Experience with Airflow or similar orchestration tools
- Knowledge of infrastructure-as-code tools (Terraform, CDK)
- Exposure to lakehouse technologies (Iceberg, Hudi) or streaming frameworks
- Experience with large-scale data processing (gigabyte- to terabyte-scale datasets)
Ideal candidate: Strong Spark + AWS engineer with some Kubernetes exposure and experience tuning production workloads.
Skills
AWS, AWS CloudWatch, AWS EMR, AWS IAM, AWS S3, Apache Spark, CI/CD, CloudWatch, DevOps, IAM, Kubernetes, PySpark, S3, Scala, Spark, Spark SQL, SQL, Terraform