AI ML Performance Engineer

VDart

Bellevue · Hybrid · Contract · 1mo ago

About the role

We are looking for an experienced AI/ML Performance Engineer to design and execute high-intensity stress workloads for next-generation AI platforms. This role focuses on identifying performance bottlenecks, improving system stability, and enabling scalable, production-ready AI infrastructure.

Key Responsibilities

  • Design and implement high-intensity stress workloads using PyTorch and Triton
  • Analyze system performance to identify bottlenecks, stability issues, and performance cliffs
  • Develop workloads targeting large GEMMs, attention mechanisms, MoE-like architectures, mixed precision, and long-running executions
  • Build custom Triton kernels to stress hardware execution units, memory hierarchies, and synchronization paths
  • Create scalable test harnesses across problem size, number of devices, and runtime duration
  • Integrate workloads with profiling, monitoring, and failure triage tools
  • Collaborate with platform, firmware, and SDK teams
  • Provide documentation and reproducible scripts for lab and CI environments

Required Skills

  • Strong experience in performance testing and analysis (test result analysis, server stats, bottleneck identification, tuning, and recommendations)
  • Proficiency in Python
  • Scripting experience using Shell or PowerShell
  • Experience with PyTorch and/or Triton

Nice to Have

  • Experience with AI hardware platforms or simulators
  • Exposure to distributed systems and multi-device workloads

Skills

Python, PyTorch, Triton
