
AI / Infrastructure Engineer

Harrison Clarke

San Francisco · On-site · Full-time · Mid Level · Posted 2w ago

About the role

Harrison Clarke are partnered with an early-stage startup building ground-truth infrastructure for AI agents: the data, evaluation, and runtime systems that allow LLM-powered agents to behave reliably in real-world environments.

As an AI / Infrastructure Engineer focused on LLM systems, you will help design and operate the production backbone for deploying and scaling large language models. This includes building low-latency inference systems, GPU-optimised serving infrastructure, and the evaluation pipelines that ensure model outputs remain accurate, consistent, and grounded.

Key Responsibilities

  • Design and operate infrastructure for deploying LLMs (e.g., GPT-style, open-weight, fine-tuned models)
  • Build and optimise high-throughput, low-latency inference pipelines
  • Implement scalable LLM serving systems (batching, caching, streaming, request routing)
  • Manage GPU-based infrastructure with a focus on cost and performance efficiency
  • Deploy and maintain model serving stacks (e.g., vLLM, TensorRT-LLM, TGI, Triton, or equivalents)
  • Build systems for model routing, fallback logic, and multi-model orchestration
  • Implement observability for LLM systems (latency, throughput, cost, failure modes, quality signals)
  • Design evaluation infrastructure for production LLM behaviour (A/B testing, regression testing, drift detection)
  • Collaborate with ML and product teams to productionise RAG systems and fine-tuned models

Qualifications

  • 3+ years in infrastructure engineering, MLOps, or backend systems roles
  • Proven experience deploying ML or LLM systems in production environments
  • Strong proficiency in Python and/or Go
  • Strong understanding of distributed systems and scalable backend architecture
  • Hands-on experience with Docker, Kubernetes and CI/CD pipelines
  • Familiarity with model serving frameworks (e.g., vLLM, Triton, TGI)
  • Experience building high-performance APIs for production systems
  • Strong debugging skills across infrastructure and application layers
  • Must have the legal right to work in the US and must not require visa sponsorship

Skills

CI/CD · Docker · Go · Kubernetes · LLM · Python · TGI · Triton · vLLM
