Senior Data Engineer
DataTech Recruitment
About the role
A Senior Data Engineer vacancy based in Cape Town.
Build serious data systems that actually matter. This is a high-impact senior role for a data engineer who knows how to get the best out of Spark, writes strong Python, and enjoys turning messy legacy pipelines into clean, scalable systems.
You will join a fast-moving team working on modern cloud data platforms, lakehouse architecture, and large-scale processing where performance, quality, and good engineering judgment count.
The core tech includes Spark, PySpark, Python, Delta Lake, Parquet, Azure Synapse, SQL, Docker, and modern orchestration approaches.
Salary: R80 000 per month CTC.
Type: Remote.
Key Responsibilities:
• Design, build, and optimise high-performance data pipelines using Python and PySpark
• Improve Spark workloads through better memory use, partitioning, shuffle tuning, and DAG optimisation
• Refactor legacy SQL-heavy ETL processes into modular, reusable Python libraries
• Build and maintain lakehouse data layers across Bronze, Silver, and Gold
• Work with Delta Lake and Parquet to improve versioning, schema management, and storage performance
• Help drive a code-first approach to orchestration and reduce reliance on cloud-specific tooling
• Support a cloud-agnostic engineering approach with portable, scalable solutions
• Contribute to code reviews, testing standards, and overall platform quality
• Partner with analysts, data scientists, and business teams to deliver practical data solutions
• Mentor junior engineers and help shape strong engineering standards across the team
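For a sense of what the tuning responsibilities above involve: partitioning, shuffle tuning, and memory use in Spark are typically controlled through configuration like the following. This is an illustrative sketch only — the values are hypothetical and workload-dependent, not part of this posting:

```
# spark-defaults.conf — illustrative values only
spark.sql.shuffle.partitions        400         # size shuffle parallelism to the data volume
spark.sql.adaptive.enabled          true        # let AQE coalesce small shuffle partitions at runtime
spark.executor.memory               8g          # per-executor heap; tune against spill metrics in the Spark UI
spark.memory.fraction               0.6         # share of heap for execution and storage memory
spark.sql.files.maxPartitionBytes   134217728   # ~128 MB input splits when reading Parquet
```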
Requirements:
• Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field
• 6+ years of experience working with Spark or PySpark in production environments
• Strong Python skills, with experience building maintainable, production-grade applications
• Proven ability to identify and fix Spark performance bottlenecks using the Spark UI
• Solid SQL skills, including the ability to interpret and migrate existing ETL logic
• Experience with Azure Synapse Analytics, Dedicated SQL Pools, and Data Factory
• Strong hands-on experience with Delta Lake and Parquet in high-volume environments
• Experience with Docker and open-source or portable engineering standards
• Strong understanding of scalable data architecture and modern data engineering best practices
• Experience working in collaborative engineering teams on complex data platforms
• Ability to work across technical and non-technical teams and communicate clearly
• A strong grasp of security, compliance, and data governance in data engineering environments
If you are a senior data engineer who wants to own performance, shape modern data platforms, and work with a strong technical stack, apply now.