Sr. DevOps/Site Reliability Engineer

Jobs via Dice

Arlington Heights · Hybrid Contract Senior 2w ago

About the role

We are looking for a Senior Site Reliability Engineer (SRE) with deep experience in AWS infrastructure, automation, observability, and production support. As an SRE, you will ensure our cloud-native systems are resilient, scalable, and efficient, driving reliability through code, not just processes.

Key Responsibilities

  • Design, implement, and maintain scalable, secure, and highly available infrastructure on AWS
  • Develop and improve CI/CD pipelines and Infrastructure as Code (IaC) using Terraform and Harness
  • Own and implement monitoring, alerting, logging, and distributed tracing with tools like Dynatrace or Datadog
  • Troubleshoot production incidents, conduct blameless postmortems, and improve incident response processes
  • Optimize systems for cost, performance, and reliability
  • Drive chaos engineering and resilience testing
  • Collaborate with development teams to embed SRE practices like SLAs, SLOs, and error budgets
  • Mentor junior SREs and promote DevOps/SRE culture across the organization

Basic Qualifications

  • Strong experience in SRE, DevOps, or cloud engineering
  • Expertise in AWS core services (EC2, ECS/EKS, Lambda, S3, VPC, RDS, IAM, CloudFront, etc.)
  • Hands-on experience with Terraform, Ansible, or other IaC tools
  • Strong scripting/coding skills (Python, Go, Shell, etc.)
  • Experience with Kubernetes, containerization, and orchestration
  • Deep knowledge of Linux systems and networking

Preferred Qualifications

  • Experience with service meshes (e.g., Istio, App Mesh)
  • Familiarity with AWS Well-Architected Framework
  • Experience building self-healing systems and automated remediation
  • Background in security, compliance, or multi-account/multi-region AWS architectures

Certifications (Optional/Preferred)

  • AWS Certified DevOps Engineer – Professional
  • AWS Certified Solutions Architect – Professional

Skills

Ansible, AWS, CloudFront, Datadog, DevOps, Docker, Dynatrace, EC2, ECS, EKS, Error budgets, Go, Harness, IAM, IaC, Istio, Kubernetes, Lambda, Linux, Monitoring, Networking, Observability, Python, RDS, Reliability, S3, Scalability, Security, Shell, Site Reliability Engineering, SLAs, SLOs, Terraform, VPC
