Principal Site Reliability Engineer

Jobgether

Southey · On-site · Full-time · Lead · 5d ago

About the role

Enhance and manage cloud infrastructure as a Principal Site Reliability Engineer. Drive reliability, scalability, and security across AWS/EKS environments while collaborating with engineering and customer teams.

In this hands-on role, you will take ownership of mission-critical workloads, optimizing performance and implementing automated solutions. Your deep technical expertise will be essential for leading incident response, ensuring security compliance, and supporting CI/CD processes. The position suits a self-motivated individual who excels in a dynamic environment, and it focuses on shaping infrastructure strategies that directly affect customer success.

Key Responsibilities:

  • Own and enhance cloud infrastructure to maintain availability
  • Manage Kubernetes clusters' operation and health
  • Lead incident response and systemic fixes to reduce downtime
  • Oversee cloud security and IAM governance
  • Drive infrastructure design and cost optimization strategies

Requirements:

  • 5+ years in SRE or DevOps roles
  • Strong AWS and Kubernetes expertise required
  • Proficiency in infrastructure-as-code tools like Terraform
  • Experience with monitoring tools and CI/CD pipelines
  • Strong scripting skills in Python and Bash

Utilize your expertise to optimize a complex cloud infrastructure and ensure seamless operation across mission-critical environments.

Skills

AWS, Bash, CI/CD, IAM, Kubernetes, Python, Terraform
