Principal Site Reliability Engineer

Jobgether

Southey · On-site · Full-time · Lead · 3mo ago

About the role

Enhance and manage cloud infrastructure as a Principal Site Reliability Engineer. Drive reliability, scalability, and security across AWS/EKS environments while collaborating with engineering and customer teams.

In this hands-on role, you will take ownership of mission-critical workloads, optimizing performance and implementing automated solutions. Your deep technical expertise will be essential for leading incident response, ensuring security compliance, and supporting CI/CD processes. Ideal for a self-motivated individual who excels in a dynamic environment, this position focuses on shaping infrastructure strategies that directly affect customer success.

Key Responsibilities:

Own and enhance cloud infrastructure to maintain availability
Manage the operation and health of Kubernetes clusters
Lead incident response and systemic fixes to reduce downtime
Oversee cloud security and IAM governance
Drive infrastructure design and cost optimization strategies

Requirements:

5+ years in SRE or DevOps roles
Strong AWS and Kubernetes expertise required
Proficiency in infrastructure-as-code tools like Terraform
Experience with monitoring tools and CI/CD pipelines
Strong scripting skills in Python and Bash

Utilize your expertise to optimize a complex cloud infrastructure and ensure seamless operation across mission-critical environments.

Skills

AWS, Bash, CI/CD, IAM, Kubernetes, Python, Terraform
