Site Reliability Engineer

TES The Employment Solution

Montreal · On-site · Full-time · 2mo ago

About the role

About

Elevate system reliability as a Site Reliability Engineer. Design and enhance highly available systems while applying Agile principles and facilitating Scrum ceremonies for the team.

In this role, you'll work on crafting, operating, and improving resilient systems. Key responsibilities include defining SLOs and SLIs, maintaining observability, and automating operations through CI/CD practices. Collaborate closely with development teams to improve reliability from the ground up and manage incidents through a blameless post-mortem process.

Key Responsibilities

Design and operate highly reliable systems
Define and monitor SLIs, SLOs, and SLAs
Automate operations using CI/CD and infrastructure as code (IaC)
Lead blameless post-mortems and root cause analysis (RCA)
Facilitate Scrum ceremonies and Agile adoption

Requirements

Strong expertise in cloud environments (AWS)
Proficient in Kubernetes and Docker
Experience with automation tools (Terraform, Ansible)
Familiarity with Linux systems and security
Strong communication and facilitation skills

Lead the way in system reliability while fostering team collaboration and technological excellence.

Skills

Ansible, AWS, CI/CD, Docker, Kubernetes, Linux, Terraform
