
Principal Site Reliability Engineer

Parallel Domain

Vancouver · On-site · Full-time · Lead · 1mo ago

About the role

As Principal Site Reliability Engineer, you will improve the reliability and security of our cloud systems and take charge of the infrastructure that supports advanced simulation workloads for autonomous vehicle development.

This high-ownership position involves overseeing AWS/EKS environments and collaborating closely with a small team of platform engineers and cross-functional engineering groups. You'll handle proactive incident management and infrastructure improvements so that our platform meets high standards for performance and security.

Key Responsibilities:

  • Lead improvements to AWS-based infrastructure reliability
  • Manage EKS cluster operations including node strategies
  • Implement GitOps for streamlined application management
  • Address complex networking issues, including DNS and load balancing
  • Drive incident investigations and root cause analyses

Requirements:

  • 5+ years of experience in SRE or infrastructure roles
  • Solid skills in Terraform and infrastructure-as-code
  • Strong familiarity with AWS; EKS, VPC, and IAM are essential
  • Kubernetes operations knowledge and automation skills
  • Experience with observability tools such as Grafana and Elasticsearch

Join in shaping reliable cloud operations for innovative technology solutions while enhancing overall system security and performance.

Skills

AWS · AWS EKS · AWS IAM · AWS VPC · Elasticsearch · GitOps · Grafana · Kubernetes · Terraform
