Principal Site Reliability Engineer for AWS and Kubernetes Operations
Parallel Domain
About the role
Shape cloud reliability as a Principal Site Reliability Engineer leading AWS and Kubernetes operations. Enhance infrastructure for high-stakes simulation workloads in autonomous vehicle technology.
In this role, you’ll be responsible for the overall health and performance of a scalable cloud architecture, including managing EKS clusters and ensuring security compliance. You will also drive incident response and proactive issue prevention while collaborating across engineering and customer-facing teams to deliver a seamless experience.
Key Responsibilities:
- Evolve AWS infrastructure and enhance platform performance
- Lead incident investigations and deploy automated solutions
- Oversee security governance for cloud services
- Collaborate with customer teams for optimal availability
- Improve CI/CD pipelines for seamless development
Requirements:
- 5+ years in infrastructure engineering or SRE
- Proficient in infrastructure-as-code with Terraform
- Deep knowledge of AWS services and Kubernetes
- Strong networking fundamentals and security awareness
- Experience with monitoring tools like Prometheus
Become a key player in building reliable and high-performance cloud systems that foster innovation in autonomous vehicle simulation and beyond.