Principal Site Reliability Engineer

Parallel Domain

Vancouver · On-site · Full-time · Lead · Posted 2mo ago

About the role

As Principal Site Reliability Engineer, you will enhance the reliability and security of our cloud systems and take charge of the infrastructure that supports advanced simulation workloads for autonomous vehicles.

This high-ownership position involves overseeing AWS/EKS environments and collaborating closely with a small team of platform engineers and cross-functional engineering groups. You’ll take part in proactive incident management and infrastructure improvements, ensuring our platform meets high standards for performance and security.

Key Responsibilities:

Lead improvements to AWS-based infrastructure reliability
Manage EKS cluster operations including node strategies
Implement GitOps for streamlined application management
Address complex networking issues, including DNS and load balancing
Drive incident investigations and root cause analyses

Requirements:

5+ years of experience in SRE or infrastructure roles
Solid skills in Terraform and infrastructure-as-code
Strong familiarity with AWS; EKS, VPC, and IAM are essential
Kubernetes operations knowledge and automation skills
Experience with observability tools such as Grafana and Elasticsearch

Join in shaping reliable cloud operations for innovative technology solutions while enhancing overall system security and performance.

Skills

AWS, AWS EKS, AWS IAM, AWS VPC, Elasticsearch, GitOps, Grafana, Kubernetes, Terraform
