Senior LLM Engineer
Stealth Labs
About the role
What we are building
We are building an applied research team in Marseille to work on a single, hard problem: can a software platform be genuinely autonomous, one where LLM agents orchestrate infrastructure, diagnose and repair failures, deploy new components, and compose interfaces, with no human in the loop for routine operations?
We are researching and engineering the foundations with minimal dependencies.
The role
We are looking for someone who lives at the full depth of the LLM stack — from prompt to parameter. Someone who can wire a frontier model into a production agentic pipeline in the morning, run fine-tuning experiments in the afternoon, and spend the evening reading the paper that will matter in six months.
We need both in one person: the engineer who understands what a model is doing internally, and the practitioner who knows how to make it do something useful reliably.
You will be the person the rest of the team turns to when a model behaves unexpectedly, when a new architecture drops and we need to know if it changes anything for us, and when the gap between what a model can do in a demo and what it can do in production needs to be closed.
What you will own
Applied LLM integration
- Design and operate the model integration layer across the platform: agent orchestration, natural language control plane, autonomous code generation, and UI composition
- Build robust inference pipelines that are observable, debuggable, and recoverable when models produce unexpected output
- Design prompting strategies, tool use schemas, and structured output contracts that hold under production conditions
- Own the reliability of model-dependent features end to end — not just the happy path
Model customization and evaluation
- Run fine-tuning, instruction tuning, and alignment experiments on open models for platform-specific tasks
- Design and maintain rigorous evaluation suites: benchmark selection, dataset construction, automated and human evaluation pipelines
- Identify capability gaps between frontier models and what the platform actually requires, and close them
- Manage the full model lifecycle: experimentation, validation, deployment, monitoring, and replacement
Research and technical horizon
- Track LLM research and translate relevant advances into platform decisions: architecture changes, new training techniques, emerging inference methods
- Maintain a working understanding of model internals: attention mechanisms, tokenisation, training dynamics, scaling behaviour, emergent capabilities
- Assess new models and techniques critically — distinguish genuine advances from benchmark overfitting
Skills
Technical
- LLM internals
- Training and fine-tuning
- Inference engineering
- Agentic systems
- Evaluation methodology
- GPU infrastructure
We are looking for someone who:
- can explain why a model is behaving a certain way, not just observe that it is
- treats evaluation as seriously as training — a result without a rigorous eval is not a result
- has strong opinions about prompting that they are willing to revise when evidence contradicts them
- reads papers the week they drop and forms their own view before the community consensus forms
- writes code that a colleague can read, reproduce, and build on — experiments are not a licence for chaos
- is honest about the limits of what a model can do reliably and designs systems accordingly
We are not looking for someone who:
- considers prompt engineering beneath them and fine-tuning the only real work
- considers fine-tuning unnecessary because frontier models can do everything with the right prompt
- has never shipped anything to production
- has never read a paper and does not intend to start
- is reluctant to share their knowledge proactively
Practical details
- Location: Marseille, France
- Contract: Consulting leading to full time
- Seniority: 10+ years, PhD a plus