Remote Site Reliability Engineer Ensuring Scalable Infrastructure
Newton
About the role
Join a dedicated remote team as a senior-level Site Reliability Engineer focused on infrastructure reliability and resilience. In this critical role, you will design and manage the operational practices that keep our services running smoothly and efficiently. You will drive improvements, prevent incidents, and facilitate incident response through well-defined metrics and post-mortems. Your expertise will help maintain a robust, reliable system that meets user needs.
Responsibilities
- Implement improvements for reliability and fault tolerance
- Manage incidents using your technical skills
- Provide on-call support for critical services
- Define and maintain SLIs, SLOs, and SLAs
- Improve observability across all systems
Requirements
- Experience with AWS or similar cloud environments
- On-call experience with critical systems
- Familiarity with chaos engineering tools
- Strong debugging skills in production environments
- Proficiency in scripting or development languages