Senior Site Reliability Engineer / API Platform Engineer (AI-First)

VirtualVocations

Rockville · On-site · Full-time · Senior · 1w ago

About the role

Responsibilities

  • Design, build, and operate reliable, scalable infrastructure for APIs, platform services, and AI-enabled applications on AWS and Kubernetes
  • Own and enhance CI/CD pipelines, deployment workflows, and operational tooling to enable safe and fast software delivery
  • Lead incident response, root cause analysis, postmortems, and remediation efforts to continuously improve production reliability

Qualifications

  • Strong experience in Site Reliability Engineering, DevOps, Platform Engineering, or Infrastructure Software Engineering
  • Deep expertise in cloud infrastructure and distributed systems, particularly on AWS
  • Hands-on experience running Kubernetes-based services in production environments
  • Strong experience operating APIs and microservices in production
  • Experience with observability and monitoring tools such as Prometheus, Grafana, or similar systems

Skills

AWS, CI/CD, Docker, Grafana, Kubernetes, Microservices, Prometheus
