Senior Site Reliability Engineer / API Platform Engineer (AI-First)

VirtualVocations

Rockville · On-site · Full-time · Senior · 3mo ago

About the role

Design, build, and operate reliable, scalable infrastructure for APIs, platform services, and AI-enabled applications on AWS and Kubernetes
Own and enhance CI/CD pipelines, deployment workflows, and operational tooling to enable safe and fast software delivery
Lead incident response, root cause analysis, postmortems, and remediation efforts to continuously improve production reliability

Strong experience in Site Reliability Engineering, DevOps, Platform Engineering, or Infrastructure Software Engineering
Deep expertise in cloud infrastructure and distributed systems, particularly on AWS
Hands-on experience running Kubernetes-based services in production environments
Strong experience operating APIs and microservices in production
Experience with observability and monitoring tools such as Prometheus, Grafana, or similar systems

AWS · CI/CD · Docker · Grafana · Kubernetes · Microservices · Prometheus

