Data Engineer
aurigin.ai
About the role
About Us:
Aurigin.ai is a Zurich-born startup on a mission to restore trust in digital communication by protecting high-stakes organizations from AI-generated voice fraud in real time. Banks, government agencies, and enterprises use our deepfake detection to stop account takeovers, CEO impersonations, and fake news before they cause damage.
We're a team of passionate innovators from top-tier tech and consulting firms as well as startups. We thrive on collaboration, embrace a "move fast to launch and iterate" mentality, and are united by a shared vision of a safer, more transparent world in the age of AI.
Why we need you:
Our real-time deepfake detection engine is trusted by high-stakes clients, and high-quality, diverse data is the asset that keeps it ahead of emerging fraud. We need a Data Engineer to own the audio data lifecycle end to end: sourcing, ingesting, structuring, and managing terabytes of noisy, real-world audio so our ML team can focus on modeling. The role has a clear data engineering core, with room to grow into the acoustic side (applying augmentations like impulse responses and environmental noise) as you build expertise.
What you will do:
- Architect at scale: Design and maintain robust ETL pipelines and storage architectures for hundreds of TBs of audio data.
- Wrangle and validate: Clean, format, and structure raw audio for downstream use, and use statistics and acoustic checks to catch anomalies and validate dataset quality before it hits training.
- Own provenance, labeling, and licensing: Track where every sample came from, under what license, and with what labels. This is core to our model quality.
- Explore audio augmentation: Partner with the team to source, synthesize, and enrich datasets with realistic augmentations (room acoustics, impulse responses, background noise).
About you:
- Experience: 2+ years building and managing large-scale data pipelines and databases.
- Tech stack: Strong Python and SQL.
- Cloud and orchestration: Hands-on experience with at least one major cloud platform (AWS, GCP, or Azure) and at least one orchestration tool (Airflow, Prefect, Dagster, or similar). We're not dogmatic about which. Knowledge transfers.
- Statistical foundation: Solid grasp of basic statistics for measuring data quality.
- Audio fluency: Comfortable working with audio data in Python (e.g. torchaudio) and familiar with the basics: sample rates, channels, common formats, SNR.
- Ownership mindset: Self-driven, loves solving complex problems, and can run end-to-end projects in a fast-paced startup.
- Be a good human: Contribute to a positive and fun team environment.
- Languages: Fluent in English. Any other language is a plus.
Nice-to-have:
- Hands-on experience with audio augmentation libraries (audiomentations, pyroomacoustics, WavAugment) or working with RIR and noise datasets.
- Solid grasp of acoustic principles: impulse responses, reverberation, room acoustics, psychoacoustics.
- Prior experience working closely with ML or AI teams.
Benefits:
- Make a real impact: Join a company working with cutting-edge AI technologies and fighting the good fight.
- Be part of something big: Help protect the world from AI-generated chaos and shape our product and culture from the start.
- Work with cool people: Our team is smart, passionate, and driven.
- Flexibility: Work from wherever you're most productive and collaborative (minimum 2 days/week in the Zurich office).
- Competitive salary and equity: Meaningful ownership stake in the company. ~CHF 80k salary + ~CHF 20k in stock options.
If you're ready to use your skills for good and want to join, from the start, a team that's making a real difference, we want to hear from you!