ETL Test Engineer

Noname Admin Services Inc.

Toronto · Hybrid · Contract · $40–$45/hr

Job Summary

We are seeking a detail-oriented ETL Test Engineer with strong expertise in PySpark, advanced SQL, and AWS cloud services to validate large-scale data pipelines. The role focuses on ensuring data accuracy, integrity, and performance across distributed data processing systems in cloud-based architectures.

Key Responsibilities

  • Design and execute ETL testing strategies for data pipelines built using PySpark on AWS
  • Validate data transformations using advanced SQL queries across large datasets
  • Perform source-to-target data validation (RDBMS → AWS Data Lake → Data Warehouse)
  • Develop and maintain automated data validation frameworks using PySpark
  • Validate data pipelines running on AWS services such as:
    • AWS Glue (ETL jobs)
    • Amazon S3 (data lake storage)
    • Amazon Redshift (data warehouse)
  • Perform data quality checks (nulls, duplicates, schema validation, referential integrity); see the PySpark sketch after this list
  • Execute batch and near real-time pipeline validation
  • Conduct regression and integration testing for ETL workflows
  • Debug and analyze large datasets using PySpark and SQL
  • Validate data ingestion pipelines (files, APIs, streaming sources)
  • Collaborate with data engineers and architects on cloud-based data solutions
  • Ensure compliance with data governance, security, and AWS best practices
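
To make the data quality checks above concrete, here is a minimal PySpark sketch of null, duplicate, and schema validation. The S3 path and the column names (order_id, customer_id, order_ts, amount) are illustrative assumptions, not details from the posting.

```python
# Minimal data quality sketch; the dataset path and columns are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

# Hypothetical pipeline output; swap in the real S3/Redshift source.
df = spark.read.parquet("s3://example-bucket/orders/")

# Null check: count nulls in each required column.
required = ["order_id", "customer_id", "order_ts"]
null_counts = df.select(
    [F.count(F.when(F.col(c).isNull(), c)).alias(c) for c in required]
).first()

# Duplicate check: the primary key must be unique.
dupes = df.groupBy("order_id").count().filter(F.col("count") > 1).count()

# Schema check: actual columns must cover the expected contract.
expected = {"order_id", "customer_id", "order_ts", "amount"}
missing = expected - set(df.columns)

assert all(null_counts[c] == 0 for c in required), f"nulls found: {null_counts}"
assert dupes == 0, f"{dupes} duplicate order_id values"
assert not missing, f"missing columns: {missing}"
```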

Required Skills

Technical Skills:

  • Strong hands-on experience in PySpark
    • DataFrames, transformations, joins, aggregations
  • Advanced SQL expertise:
    • Complex joins, window functions, CTEs, subqueries
  • Solid experience in ETL / Data Testing
  • Hands-on experience with AWS services:
    • AWS Glue (job validation)
    • Amazon S3 (data validation in buckets)
    • Amazon Redshift (data warehouse validation)
  • Strong understanding of data warehousing concepts (fact/dimension, star schema)
  • Experience in large dataset validation and reconciliation (a reconciliation sketch follows this list)
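
As an illustration of the SQL depth involved (CTEs plus window functions) applied to reconciliation, the sketch below deduplicates a source extract to the latest record per key before it would be compared against the target. The src rows, keys, and values are hypothetical.

```python
# Reconciliation sketch using a CTE and ROW_NUMBER(); all data is hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("recon").getOrCreate()

# Stand-in for the RDBMS extract; the warehouse side would be read similarly.
src = spark.createDataFrame(
    [(1, "2024-01-01", 100.0), (1, "2024-01-02", 120.0), (2, "2024-01-01", 80.0)],
    ["id", "updated_at", "amount"],
)
src.createOrReplaceTempView("src")

# Keep only the latest record per key, then reconcile counts and amounts
# against the target table in the same fashion.
latest = spark.sql("""
    WITH ranked AS (
        SELECT id, updated_at, amount,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
        FROM src
    )
    SELECT id, updated_at, amount FROM ranked WHERE rn = 1
""")
print(latest.count())  # 2 distinct keys -> 2 rows after deduplication
```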

Preferred Skills:

  • Experience with:
    • AWS Lambda (event-driven processing)
    • Amazon EMR (Spark clusters)
    • AWS Step Functions (workflow orchestration)
  • Familiarity with Databricks on AWS
    • Knowledge of automation frameworks (PyTest, unittest); a PyTest sketch follows this list
  • Experience with CI/CD pipelines (Jenkins, GitHub Actions, AWS CodePipeline)
  • Exposure to API and streaming data testing (Kafka, Kinesis)
  • Basic Python scripting beyond PySpark
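
A minimal sketch of how such validations could be wired into a PyTest-based framework; the fixtures, sample rows, and column names are assumptions for illustration.

```python
# PyTest sketch for automated data validation; sample data is assumed.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

@pytest.fixture(scope="session")
def spark():
    return (
        SparkSession.builder.master("local[2]").appName("etl-tests").getOrCreate()
    )

@pytest.fixture
def orders(spark):
    # Stand-in for reading real pipeline output from S3 or Redshift.
    return spark.createDataFrame(
        [(1, "a", 10.0), (2, "b", 20.0)],
        ["order_id", "customer_id", "amount"],
    )

def test_primary_key_unique(orders):
    dupes = orders.groupBy("order_id").count().filter(F.col("count") > 1)
    assert dupes.count() == 0

def test_no_null_keys(orders):
    assert orders.filter(F.col("order_id").isNull()).count() == 0
```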

Testing-Specific Expertise:

  • ETL test planning, design, and execution
  • Data reconciliation techniques across distributed systems
  • Handling Slowly Changing Dimensions (SCD); see the SCD Type 2 sketch after this list
  • Data lineage and impact analysis
  • Schema validation and evolution testing
  • Performance testing for large-scale data processing
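
For example, one common SCD Type 2 check validates that each business key has exactly one current row and no overlapping history ranges. The column names (effective_from, effective_to, is_current) and the sample rows below are assumptions.

```python
# SCD Type 2 validation sketch; schema and sample rows are assumptions.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("scd2-check").getOrCreate()

dim = spark.createDataFrame(
    [
        (1, "2023-01-01", "2023-06-30", False),
        (1, "2023-07-01", "9999-12-31", True),
        (2, "2023-01-01", "9999-12-31", True),
    ],
    ["customer_id", "effective_from", "effective_to", "is_current"],
)

# Exactly one current row per business key.
current_violations = (
    dim.filter(F.col("is_current"))
    .groupBy("customer_id")
    .count()
    .filter(F.col("count") != 1)
    .count()
)

# No overlaps: each row must start after the previous row's end date
# (ISO date strings compare correctly as strings).
w = Window.partitionBy("customer_id").orderBy("effective_from")
overlaps = (
    dim.withColumn("prev_to", F.lag("effective_to").over(w))
    .filter(F.col("prev_to").isNotNull() & (F.col("effective_from") <= F.col("prev_to")))
    .count()
)

assert current_violations == 0, "keys with != 1 current row"
assert overlaps == 0, "overlapping effective-date ranges"
```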

Job Details

  • Job Type: Fixed-term contract
  • Contract length: 12 months
  • Pay: $40.00–$45.00 per hour
  • Expected hours: 40 per week
  • Work Location: Hybrid remote in Toronto, ON (Peel District)

Skills

AWS CodePipeline, AWS EMR, AWS Glue, AWS Lambda, AWS Step Functions, Amazon Redshift, Amazon S3, Databricks, GitHub Actions, Jenkins, Kafka, Kinesis, PySpark, PyTest, SQL, unittest
