Principal Site Reliability Engineer
Parallel Domain
About the role
Drive the reliability and performance of cloud systems as a Principal Site Reliability Engineer. Strengthen the AWS infrastructure behind demanding autonomous vehicle development workloads while collaborating closely with engineering teams.
This high-ownership role draws on your expertise in AWS and Kubernetes: you will oversee EKS operations, support deployments, and manage cloud security. You'll also play a key part in incident response, improve our monitoring capabilities, and implement best practices across cloud environments while ensuring high availability for enterprise customers.
Key Responsibilities:
- Own AWS infrastructure and improve its performance
- Manage EKS cluster operations for production
- Support GitOps deployment and infrastructure-as-code
- Design automated remediation systems to reduce MTTR
- Lead security governance and IAM management
Requirements:
- 5+ years in SRE, DevOps, or infrastructure roles
- Proficiency with Terraform and multi-environment patterns
- Deep experience with AWS services and Kubernetes
- Solid networking expertise in cloud environments
- Comfort with Python and Bash scripting
Use your expertise to shape reliable, secure cloud infrastructure that supports critical simulation workloads for autonomous vehicle development.