Senior Data Engineer

Orakl Oncology

Villejuif · On-site · Contract · Senior · Today

About the role

About Orakl Oncology

At Orakl Oncology, we are accelerating the development of oncology treatments. Today, fewer than 5% of new cancer drugs succeed in clinical trials, so new methods are needed. We combine cutting-edge biology and AI to build the next generation of insight platforms, powered by the world’s largest cohort of patient tumor avatars. These avatars fuel our AI-powered predictive engine, helping to anticipate clinical trial outcomes, validate therapies, and uncover new drug candidates. By generating multimodal, real-world data, we deliver insights that consistently outperform existing solutions.

Our mission is simple yet ambitious: to bring more effective treatments to patients who need them — and to make drug development smarter, faster, and more personalized. We collaborate with top hospitals, research institutes, and pharmaceutical companies worldwide. Backed by leading investors, we are a fast-growing, mission-driven startup at the intersection of science and technology.

About the Role

We generate data from multiple sources: our internal wet lab, partner hospitals, and sequencing providers. As we scale, our challenge is no longer data volume; it’s data clarity. Our datasets are rich but fragmented. Teams work across different levels of granularity (patient, sample, organoid, etc.), and we are looking for a Senior Data Engineer to own that layer.

You will design and maintain the data model that sits between our raw scientific data and the teams that need to act on it: clinical operations, wet lab scientists, and computational researchers. This is a highly operational, high-ownership role. You will be the central point of contact for data infrastructure, and your work will have immediate, visible impact on how we run trials, validate therapies, and ship new insights.

What You'll Do

Own the cross-functional data model: Define the table structure, unique identifiers, and join logic that make our data coherent across teams. Establish and enforce data conventions that work for both lab and operations teams.

Lead our LIMS integration: Drive the technical integration of our laboratory information management system (LIMS) into the production environment, from schema design to data flow and migration from current systems.

Build the final layer of data curation: Create user-facing views that merge clinical, experimental, and molecular data into clean, analysis-ready datasets accessible to non-technical users.

Bridge teams and priorities: Act as the data liaison between clinical operations, wet lab scientists, omics specialists, and computational researchers. Translate diverse requirements into coherent data solutions and ensure roadmap alignment across teams.

Who You Are

• Solution-oriented: When faced with a non-technical problem, you can gather cross-functional needs and find the right engineering solution.

• Deeply curious: You proactively seek out people’s bottlenecks and connect the dots between teams.

• A strong communicator: You translate technical concepts for non-technical audiences and are comfortable gathering requirements from scientists, clinicians, and engineers alike.

• Autonomous: You ship without waiting for perfect specifications. You own your scope end-to-end.

• Cross-functional by nature: You thrive in environments that require coordination across multiple teams with different priorities and vocabularies.

Minimum Qualifications

• MSc or Engineering degree in Computer Science, Data Science, or Computational Biology. An engineering background is a must.

• 4–5+ years of data engineering experience in an environment where data is originally messy and unstructured.

• Strong SQL skills and hands-on experience designing relational data models

• Proficiency in Python for data processing and pipeline development

• Experience with workflow orchestration tools (Airflow, Dagster, or similar)

• Comfortable with Git/GitHub in a collaborative engineering environment

Preferred Qualifications

• Experience in environments where scientists are primary data consumers

• Familiarity with biological, bioinformatics, or laboratory data systems (e.g., LIMS)

• Understanding of experimental workflows and research or clinical operations

Why Join Orakl

• Work at the intersection of data engineering and cutting-edge oncology research — on problems that directly advance patient care

• Own data infrastructure from the ground up in a fast-growing organization where your contributions are immediately visible

• Collaborate daily with clinical, wet lab, and computational teams in one of the most interdisciplinary environments in techbio

• Join a mission-driven team, backed by leading investors, with a culture built on curiosity, collaboration, and excellence

Interview Process

HR Call (15 min) — An initial discussion to understand your background, experience, and motivations.

Technical Interview (45 min) — A deep dive into your data engineering expertise and technical approach.

Technical Case (45 min) — A case study to assess your problem-solving approach and technical thinking. In-person when possible.

Founders Interview (45 min) — Meet our founders in person at our office. An opportunity to assess mutual cultural fit and alignment with our mission.
