Research Engineer, Reinforcement Learning
techire ai
Millbrae · On-site · Full-time · Today
About the role
- Want to build the large-scale RL environments frontier labs use to train agents that can truly reason and act?
- This team are creating complex reinforcement learning environments - simulations where advanced agents learn to plan, adapt, and solve multi-step problems that stretch beyond standard benchmarks.
- The focus isn't on training the models themselves, but on building the worlds that make meaningful learning and evaluation possible - the foundation for more capable, aligned systems.
Responsibilities
- You'll work end-to-end across environment design, reward dynamics, and scalable simulation - developing the feedback loops that define what "good" looks like for intelligent behaviour.
- It's open-ended, research-driven work where the task definition, data, and reward structure are often the hardest and most important problems to solve.
- You'll collaborate closely with researchers tackling unsolved challenges in reinforcement learning and agent behaviour, shaping experiments, scaling infrastructure, and refining how agents learn in the loop.
Requirements
- It suits someone with strong ML and RL experience, deep intuition for agent dynamics, and the curiosity to explore problems that don't come with clear instructions.
Location
- On-site in San Francisco.
Compensation
- Compensation up to $300K base (negotiable, depending on experience) plus equity.
Application
- If you want to help build the environments that teach the next generation of AI systems how to think, act, and adapt, we'd love to hear from you.
Skills
- ML
- RL