Senior Research Scientist

Ivoclar Vivadent Manufacturing GmbH

On-site Senior 1mo ago

About the role

Your role in the team

At Canva, our mission is to empower the world to design. We’re building AI that feels magical and lands real impact for millions of people - helping anyone create with confidence.

We’re looking for a senior research scientist who lives and breathes reinforcement learning, agentic systems, and mixture-of-experts (MoE) models to push the frontier of reasoning, tool use, latency, and reliability - and ship it to users.

You will steer research directions and take a leading, hands-on role across the agent stack: reward design and policy optimization, planning, memory, and tool orchestration, dataset construction, post-training, and the development of novel post-training approaches.

You will design precise experiments, iterate rapidly, and arrive at reliable conclusions.

Most importantly, you’ll help convert research into reliable, safe, and high‑quality product experiences.

Responsibilities

Develop agent systems (planning, multimodal tool use, retrieval, novel training approaches, modeling ablations) for real tasks in design, vision, and language.
Scale post-training and RL across distributed systems (PyTorch) with efficient data loaders, tracing/telemetry, stable training of mixture-of-experts (MoE) architectures, and reproducible pipelines; profile, debug, and optimize.
Contribute to the research agenda for RL/agentic systems aligned with Canva’s product goals; identify high‑leverage bets and retire dead ends quickly.
Build reward models and learning loops: RLHF/RLAIF, preference modeling, DPO/IPO‑style objectives, offline/online RL, curriculum learning, and credit assignment for multi‑step reasoning.
Develop simulation and sandbox tasks that surface failure modes (planning errors, tool‑use brittleness, hallucination, unsafe actions) and turn them into measurable targets.
Help align on rigorous evaluation for agents (task success, reliability, latency, safety, regressions).
Set up offline suites and online A/B tests; favor simple, controlled experiments that generalize.
Collaborate and ship: work shoulder‑to‑shoulder with product, design, safety, and platform to land research as reliable features—then iterate.
Share and elevate: mentor teammates, present findings internally, and contribute back to the community when it helps the field and our users.

What we offer

Achieving our crazy big goals motivates us to work hard - and we do - but you'll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too.

We also offer a stack of benefits to set you up for every success in and outside of work.

Here's a taste of what's on offer:

Equity packages - we want our success to be yours too.
Inclusive parental leave policy that supports all parents & carers.
An annual Vibe & Thrive allowance to support your wellbeing, social connection, home office setup & more.
Flexible leave options that empower you to be a force for good, take time to recharge and support you personally.

Technologies and skills

Python
PyTorch

Our expectations:

Qualifications

Depth in implementing and post-training MoEs, LLMs, VLMs, or diffusion models, with a track record of shipped research or publications in MoEs, RL, or agents.
Fluency in Python and PyTorch; you’re comfortable in large ML codebases and can profile, debug, and optimize training and inference.

Experience

Experience modifying and adapting open-source models.
Strong experience in experimental design: tight baselines, clean ablations, reproducibility, and clear, data-driven conclusions.
Practical experience building agent loops (planning, tool invocation, retrieval, memory) and evaluating multi-step reasoning quality.
Hands-on experience with policy optimization, reward modeling, and preference learning (e.g., RLHF/RLAIF, DPO/IPO, actor-critic/PPO, offline RL).
Experience with large‑scale training (distributed training, experiment tracking, evaluation harnesses) and cloud multimodal tooling.
Experience with RL for MoE architectures.

Benefits

Mental Health Care
Fresh Fruit
Relaxation Rooms
Meal Vouchers
Excellent Traffic Connections
Tabletop Soccer, etc.
Health Care Benefits
Employee Stock Options
Snacks, Sweets
Coffee, Tea, etc.
Flexible Working Hours
Public Transport Allowance
No All-In Contracts
Bicycle Parking Space
Company Notebook for Private Use

