Principal Site Reliability Engineer for AWS and Kubernetes Operations

Parallel Domain

Barrie · On-site · Full-time · Lead · 1w ago

About the role

Shape cloud reliability as a Principal Site Reliability Engineer leading AWS and Kubernetes operations. Enhance infrastructure for high-stakes simulation workloads in autonomous vehicle technology.

In this role, you’ll be responsible for the overall health and performance of a scalable cloud architecture, managing EKS clusters, and ensuring security compliance. You will also drive initiatives for incident response and proactive issue prevention while collaborating across engineering and customer-facing teams to deliver a seamless experience.

Key Responsibilities:

  • Evolve AWS infrastructure and enhance platform performance
  • Lead incident investigations and deploy automated solutions
  • Oversee security governance for cloud services
  • Collaborate with customer teams for optimal availability
  • Improve CI/CD pipelines for seamless development

Requirements:

  • 5+ years in infrastructure engineering or SRE
  • Proficient in infrastructure-as-code with Terraform
  • Deep knowledge of AWS services and Kubernetes
  • Strong networking fundamentals and security awareness
  • Experience with monitoring tools like Prometheus

Become a key player in building reliable and high-performance cloud systems that foster innovation in autonomous vehicle simulation and beyond.

Skills

AWS · CI/CD · Kubernetes · Prometheus · Terraform
