
Data Engineer

aurigin.ai

Flexible, Full-time, Mid Level, From CHF 80k/yr, posted 2w ago

About the role

About Us:

Aurigin.ai is a Zurich-born startup on a mission to restore trust in digital communication by protecting high-stakes organizations from AI-generated voice fraud in real time. Banks, government agencies, and enterprises use our deepfake detection to stop account takeovers, CEO impersonations, and fake news before they cause damage.

We're a team of passionate innovators from top-tier tech and consulting firms as well as startups. We thrive on collaboration, embrace a "move fast to launch and iterate" mentality, and are united by a shared vision of a safer, more transparent world in the age of AI.

Why we need you:

Our real-time deepfake detection engine is trusted by high-stakes clients, and high-quality, diverse data is the asset that keeps it ahead of emerging fraud. We need a Data Engineer to own the audio data lifecycle end to end: sourcing, ingesting, structuring, and managing terabytes of noisy, real-world audio so our ML team can focus on modeling. The role has a clear data engineering core, with room to grow into the acoustic side (applying augmentations like impulse responses and environmental noise) as you build expertise.

What you will do:

  • Architect at scale: Design and maintain robust ETL pipelines and storage architectures for hundreds of TBs of audio data.
  • Wrangle and validate: Clean, format, and structure raw audio for downstream use, and use statistics and acoustic checks to catch anomalies and validate dataset quality before it hits training.
  • Own provenance, labeling, and licensing: Track where every sample came from, under what license, and with what labels. This is core to our model quality.
  • Explore audio augmentation: Partner with the team to source, synthesize, and enrich datasets with realistic augmentations (room acoustics, impulse responses, background noise).

About you:

  • Experience: 2+ years building and managing large-scale data pipelines and databases.
  • Tech stack: Strong Python and SQL.
  • Cloud and orchestration: Hands-on experience with at least one major cloud platform (AWS, GCP, or Azure) and at least one orchestration tool (Airflow, Prefect, Dagster, or similar). We're not dogmatic about which one; the knowledge transfers.
  • Statistical foundation: Solid grasp of basic statistics for measuring data quality.
  • Audio fluency: Comfortable working with audio data in Python (e.g. torchaudio) and familiar with the basics: sample rates, channels, common formats, SNR.
  • Ownership mindset: Self-driven, loves solving complex problems, and can run end-to-end projects in a fast-paced startup.
  • Be a good human: Contribute to a positive and fun team environment.
  • Languages: Fluent in English. Any other language is a plus.

Nice-to-have:

  • Hands-on experience with audio augmentation libraries (audiomentations, pyroomacoustics, WavAugment) or working with RIR and noise datasets.
  • Solid grasp of acoustic principles: impulse responses, reverberation, room acoustics, psychoacoustics.
  • Prior experience working closely with ML or AI teams.

Benefits:

  • Make a real impact: Join a company working with cutting-edge AI technologies and fighting the good fight.
  • Be part of something big: Help protect the world from AI-generated chaos and shape our product/culture from the start.
  • Work with cool people: Our team is smart, passionate, and driven.
  • Flexibility: Work from wherever you're most productive and collaborative (minimum 2 days/week in the Zurich office).
  • Competitive salary and equity: Meaningful ownership stake in the company. ~CHF 80k salary + ~CHF 20k in stock options.

If you're ready to use your skills for good and want to join, from the start, a team that's making a real difference, we want to hear from you!

Skills

AWS, Azure, GCP, Python, SQL, Airflow, Dagster, Prefect, torchaudio
