Senior Site Reliability Engineer (Senior SRE)
Mondo
Remote · US · Contract · Senior · $70 – $90/hr · Posted 1mo ago
About the role
The Senior Site Reliability Engineer is responsible for improving reliability, scalability, and operational excellence across distributed systems through automation, observability, and engineering best practices.
Responsibilities
- Lead reliability and recovery planning for critical systems and services
- Define and maintain SLIs, SLOs, and error budget practices
- Drive incident response efforts and lead complex outage investigations
- Conduct root cause analysis and implement corrective actions
- Design and build automation to reduce operational toil
- Improve CI/CD pipelines, deployment workflows, and rollback strategies
- Develop and maintain observability tooling for metrics, logs, and tracing
- Partner with release and change management teams on release readiness
- Mentor engineers and promote operational excellence across teams
- Influence system architecture to improve reliability and scalability
Requirements
Must-Haves:
- Bachelor's degree in Computer Science, Engineering, or related field
- 6–10 years of experience in SRE, DevOps, platform engineering, or software engineering
- Strong experience with AWS and distributed systems
- Professional experience performing root cause analysis and incident management
- Experience documenting SRE systems and operational processes
- Strong programming skills across multiple languages
- Experience with observability tools, monitoring, logging, and tracing
- Experience with CI/CD pipelines and deployment automation
- Familiarity with containerization and orchestration technologies
- Strong troubleshooting, analytical, and problem-solving skills
- Ability to work effectively in regulated or enterprise environments
Nice-to-Haves:
- Experience with Infrastructure as Code tools such as CloudFormation
- Experience with Spring, Spring Boot, React, or Angular
- Experience with Tomcat, Netty, Node.js, or Next.js
- Experience with relational and non-relational databases
- Experience working in Agile/Scrum environments
- Experience with ITSM tools such as ServiceNow
- Familiarity with ITIL-based release and change management processes
- Knowledge of security compliance frameworks such as ISO 27001 or SOC 2
Benefits
- Health insurance
- Retirement plan
- Paid sick leave
Skills
AWS, CI/CD, CloudFormation, Docker, Kubernetes, Node.js, React, Spring, Spring Boot