
Multimodal AI Engineer

Studio Jadu

Zürich · On-site · 1mo ago

About the role

We’re looking for an engineer to design, build, and improve the core AI workflows behind our product.

This is not a traditional ML role. The work is centered on LLMs, VLMs, and image/video/audio generation models used as part of real production workflows. We’re looking for someone who has strong hands-on experience building systems around these models, evaluating them in practice, and improving quality, reliability, and efficiency over time.

What you’ll do

Design and build end-to-end workflows powered by LLMs, VLMs, and multimodal generation models
Integrate, manage, and benchmark models for text, image, audio, and video
Run experiments on prompts, system prompts, model configurations, and inference pipelines
Build evaluation frameworks using human review, automated benchmarks, and LLM-as-judge approaches where appropriate
Analyze model behavior and failure modes, and turn findings into better prompts, better routing, and better workflows
Develop scoring, ranking, and recommendation layers for multimodal outputs
Build APIs and internal tools that make these systems reusable, reproducible, and efficient

What we’re looking for

Proven hands-on experience building applications or internal systems where LLMs, VLMs, or generative media models were central
Strong understanding of how these models behave in practice, including prompting, evaluation, reliability, and cost/latency tradeoffs
Experience working with image, video, and/or audio generation models, including evaluating output quality and deciding what works in production
Strong Python skills and solid software engineering fundamentals
Ability to design experiments and iterate quickly based on evidence

Nice to have

Experience fine-tuning or training LLMs/VLMs
Experience with multimodal retrieval, ranking, or orchestration systems
Experience building human-in-the-loop workflows for creative tools

About us

We are a startup at the intersection of AI and media. We are building the next generation of tools that help creatives transform stories into production-ready video and reach their fans. Our mission is to be at the service of good stories.

Skills

LLM · Python · VLM
