ETL Test Engineer
Noname Admin Services Inc.
Toronto · Hybrid · Contract · $40–$45/hr
Job Summary
We are seeking a detail-oriented ETL Test Engineer with strong expertise in PySpark, Advanced SQL, and AWS cloud services to validate large-scale data pipelines. The role focuses on ensuring data accuracy, integrity, and performance across distributed data processing systems in cloud-based architectures.
Key Responsibilities
- Design and execute ETL testing strategies for data pipelines built using PySpark on AWS
- Validate data transformations using advanced SQL queries across large datasets
- Perform source-to-target data validation (RDBMS → AWS Data Lake → Data Warehouse)
- Develop and maintain automated data validation frameworks using PySpark
- Validate data pipelines running on AWS services such as:
  - AWS Glue (ETL jobs)
  - Amazon S3 (data lake storage)
  - Amazon Redshift (data warehouse)
- Perform data quality checks (nulls, duplicates, schema validation, referential integrity); a sketch of these checks follows this list
- Execute batch and near real-time pipeline validation
- Conduct regression and integration testing for ETL workflows
- Debug and analyze large datasets using PySpark and SQL
- Validate data ingestion pipelines (files, APIs, streaming sources)
- Collaborate with data engineers and architects on cloud-based data solutions
- Ensure compliance with data governance, security, and AWS best practices
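For illustration, here is a minimal PySpark sketch of the data quality checks described above. The S3 path, column names, and expected schema are hypothetical stand-ins, not details of this role's actual pipelines.

```python
# Minimal PySpark sketch of the data quality checks above (nulls,
# duplicates, schema validation). The S3 path, column names, and
# expected schema are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import LongType, StringType

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.parquet("s3://example-bucket/lake/orders/")  # hypothetical path

# Schema validation: compare field names and types against the contract
# (nullability is ignored because Parquet readers often relax it).
expected = {
    "order_id": LongType(),
    "customer_id": LongType(),
    "status": StringType(),
}
actual = {f.name: f.dataType for f in df.schema.fields}
assert actual == expected, f"schema drift: {actual}"

# Null check on key columns.
nulls = df.select(
    *[F.sum(F.col(c).isNull().cast("int")).alias(c)
      for c in ("order_id", "customer_id")]
).first().asDict()

# Duplicate check on the business key.
dups = df.groupBy("order_id").count().filter(F.col("count") > 1).count()
print(f"nulls={nulls}, duplicate_keys={dups}")
```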
Required Skills
Technical Skills:
- Strong hands-on experience in PySpark:
  - DataFrames, transformations, joins, aggregations
- Advanced SQL expertise (see the reconciliation sketch after this list):
  - Complex joins, window functions, CTEs, subqueries
- Solid experience in ETL / Data Testing
- Hands-on experience with AWS services:
  - AWS Glue (job validation)
  - Amazon S3 (data validation in buckets)
  - Amazon Redshift (data warehouse validation)
- Strong understanding of data warehousing concepts (fact/dimension, star schema)
- Experience in large dataset validation and reconciliation
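As an example of the advanced SQL validation work involved, the following hedged sketch reconciles a source table against a target using a CTE and a window function via Spark SQL. The table and column names (src_orders, tgt_orders, order_id, amount, updated_at) are assumptions for illustration.

```python
# Illustrative source-to-target reconciliation in Spark SQL using a CTE
# and a window function. Table and column names are hypothetical
# stand-ins for registered views or catalog tables.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("recon").getOrCreate()

mismatches = spark.sql("""
    WITH src AS (
        SELECT order_id, amount FROM (
            SELECT order_id, amount,
                   ROW_NUMBER() OVER (PARTITION BY order_id
                                      ORDER BY updated_at DESC) AS rn
            FROM src_orders
        ) d WHERE rn = 1              -- keep only the latest source row per key
    )
    SELECT COALESCE(s.order_id, t.order_id) AS order_id,
           s.amount AS src_amount,
           t.amount AS tgt_amount
    FROM src s
    FULL OUTER JOIN tgt_orders t ON s.order_id = t.order_id
    WHERE s.order_id IS NULL          -- row missing from source
       OR t.order_id IS NULL          -- row missing from target
       OR s.amount <> t.amount        -- value drift
""")
mismatches.show()
```

An empty result set means the two sides reconcile under this rule.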
Preferred Skills:
- Experience with:
  - AWS Lambda (event-driven processing)
  - Amazon EMR (Spark clusters)
  - AWS Step Functions (workflow orchestration)
- Familiarity with Databricks on AWS
- Knowledge of automation frameworks (PyTest, unittest); see the PyTest sketch after this list
- Experience with CI/CD pipelines (Jenkins, GitHub Actions, AWS CodePipeline)
- Exposure to API and streaming data testing (Kafka, Kinesis)
- Basic Python scripting beyond PySpark
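To illustrate the automation-framework skills above, here is a minimal PyTest sketch. The local Spark session and toy datasets are assumptions; real tests would read from the pipelines under validation.

```python
# Minimal PyTest sketch for data validation tests. The fixture runs a
# local Spark session; the datasets are toy stand-ins for real extracts.
import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[2]").appName("etl-tests").getOrCreate()

def test_no_duplicate_order_ids(spark):
    df = spark.createDataFrame([(1, "shipped"), (2, "open")], ["order_id", "status"])
    dup = df.groupBy("order_id").count().filter("count > 1").count()
    assert dup == 0

def test_row_counts_match(spark):
    src = spark.range(100)  # stand-in for the source extract
    tgt = spark.range(100)  # stand-in for the warehouse load
    assert src.count() == tgt.count()
```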
Testing-Specific Expertise:
- ETL test planning, design, and execution
- Data reconciliation techniques across distributed systems
- Handling Slowly Changing Dimensions (SCD); see the SCD validation sketch after this list
- Data lineage and impact analysis
- Schema validation and evolution testing
- Performance testing for large-scale data processing
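The following sketch illustrates two common SCD Type 2 checks mentioned above, assuming a hypothetical dim_customer table with customer_id, is_current, effective_from, and effective_to columns.

```python
# Hedged sketch of two common SCD Type 2 checks: every business key has
# exactly one current row, and effective-date ranges do not overlap.
# The dim_customer table and its column names are assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("scd-checks").getOrCreate()
dim = spark.table("dim_customer")  # hypothetical Type 2 dimension

# Check 1: exactly one current row per key (catches both 0 and >1).
bad_current = (
    dim.groupBy("customer_id")
       .agg(F.sum(F.col("is_current").cast("int")).alias("n_current"))
       .filter(F.col("n_current") != 1)
)

# Check 2: a version's effective_from must not precede the previous
# version's effective_to for the same key.
w = Window.partitionBy("customer_id").orderBy("effective_from")
overlaps = (
    dim.withColumn("prev_to", F.lag("effective_to").over(w))
       .filter(F.col("effective_from") < F.col("prev_to"))
)

assert bad_current.count() == 0, "keys without exactly one current row"
assert overlaps.count() == 0, "overlapping effective-date ranges"
```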
Job Details
- Job Type: Fixed term contract
- Contract length: 12 months
- Pay: $40.00–$45.00 per hour
- Expected hours: 40 per week
- Work Location: Hybrid remote in Toronto, ON (Peel District)
Skills
AWS CodePipeline · AWS EMR · AWS Glue · AWS Lambda · AWS Step Functions · Amazon Redshift · Amazon S3 · Databricks · GitHub Actions · Jenkins · Kafka · Kinesis · PySpark · PyTest · SQL · unittest