AI ML Performance Engineer

VDart

Bellevue · Hybrid · Contract · 2mo ago

About the Role

We are looking for an experienced AI/ML Performance Engineer to design and execute high-intensity stress workloads for next-generation AI platforms. This role focuses on identifying performance bottlenecks, improving system stability, and enabling scalable, production-ready AI infrastructure.

Key Responsibilities

Design and implement high-intensity stress workloads using PyTorch and Triton
Analyze system performance to identify bottlenecks, stability issues, and performance cliffs
Develop workloads targeting large GEMMs, attention mechanisms, MoE-like architectures, mixed precision, and long-running executions
Build custom Triton kernels to stress hardware execution units, memory hierarchies, and synchronization paths
Create scalable test harnesses across problem size, number of devices, and runtime duration
Integrate workloads with profiling, monitoring, and failure triage tools
Collaborate with platform, firmware, and SDK teams
Provide documentation and reproducible scripts for lab and CI environments

Required Skills

Strong experience in performance testing and analysis (test result analysis, server stats, bottleneck identification, tuning, and recommendations)
Proficiency in Python
Scripting experience using Shell or PowerShell
Experience with PyTorch and/or Triton

Nice to Have

Experience with AI hardware platforms or simulators
Exposure to distributed systems and multi-device workloads

Skills

Python, PyTorch, Triton
