Senior Site Reliability Engineer / API Platform Engineer (AI-First)
VirtualVocations
Rockville · On-site · Full-time · Senior · 1w ago
About the role
Responsibilities
- Design, build, and operate reliable, scalable infrastructure for APIs, platform services, and AI-enabled applications on AWS and Kubernetes
- Own and enhance CI/CD pipelines, deployment workflows, and operational tooling to enable safe and fast software delivery
- Lead incident response, root cause analysis, postmortems, and remediation efforts to continuously improve production reliability
Qualifications
- Strong experience in Site Reliability Engineering, DevOps, Platform Engineering, or Infrastructure Software Engineering
- Deep expertise in cloud infrastructure and distributed systems, particularly on AWS
- Hands-on experience running Kubernetes-based services in production environments
- Strong experience operating APIs and microservices in production
- Experience with observability and monitoring tools such as Prometheus, Grafana, or similar systems
Skills
AWS, CI/CD, Docker, Grafana, Kubernetes, Microservices, Prometheus