Site Reliability Engineer

TES The Employment Solution

Montreal · On-site · Full-time · Posted 2d ago

About the role

As a Site Reliability Engineer, you will design and improve highly available systems while applying Agile principles and facilitating Scrum ceremonies for the team.

In this role, you'll build, operate, and improve resilient systems. Key responsibilities include defining SLOs and SLIs, maintaining observability, and automating operations through CI/CD practices. You'll work closely with development teams to build reliability in from the design stage, and you'll handle incidents and review them through a blameless post-mortem process.

Key Responsibilities

  • Design and operate highly reliable systems
  • Define and monitor SLOs, SLIs, and SLAs
  • Automate operations using CI/CD and infrastructure as code (IaC)
  • Run blameless post-mortems and root cause analysis (RCA)
  • Facilitate Scrum ceremonies and Agile adoption

Requirements

  • Strong expertise in cloud environments (AWS)
  • Proficient in Kubernetes and Docker
  • Experience with automation tools (Terraform, Ansible)
  • Familiarity with Linux systems and security
  • Strong communication and facilitation skills

Lead the way in system reliability while fostering team collaboration and technological excellence.

Skills

Ansible, AWS, CI/CD, Docker, Kubernetes, Linux, Terraform
