Multimodal AI Engineer
Studio Jadu
About the role
We’re looking for an engineer to design, build, and improve the core AI workflows behind our product.
This is not a traditional ML role. The work is centered on LLMs, VLMs, and image/video/audio generation models used as part of real production workflows. We’re looking for someone who has strong hands-on experience building systems around these models, evaluating them in practice, and improving quality, reliability, and efficiency over time.
What you’ll do
- Design and build end-to-end workflows powered by LLMs, VLMs, and multimodal generation models
- Integrate, manage, and benchmark models for text, image, audio, and video
- Run experiments on prompts, system prompts, model configurations, and inference pipelines
- Build evaluation frameworks using human review, automated benchmarks, and, where appropriate, LLM-as-judge approaches
- Analyze model behavior and failure modes, and turn findings into better prompts, better routing, and better workflows
- Develop scoring, ranking, and recommendation layers for multimodal outputs
- Build APIs and internal tools that make these systems reusable, reproducible, and efficient
What we’re looking for
- Proven hands-on experience building applications or internal systems where LLMs, VLMs, or generative media models were central
- Strong understanding of how these models behave in practice, including prompting, evaluation, reliability, and cost/latency tradeoffs
- Experience working with image, video, and/or audio generation models, including evaluating output quality and deciding what works in production
- Strong Python skills and solid software engineering fundamentals
- Ability to design experiments and iterate quickly based on evidence
Nice to have
- Experience fine-tuning or training LLMs/VLMs
- Experience with multimodal retrieval, ranking, or orchestration systems
- Experience building human-in-the-loop workflows for creative tools
About us
We are a startup at the intersection of AI and media. We are building the next generation of tools that help creatives transform stories into production-ready video and reach their fans. Our mission is to be at the service of good stories.