
Site Reliability Engineer (SRE) / Scrum Master

COFOMO

Montreal · On-site · Full-time · 4d ago

About the role

Responsibilities

  • Design, operate, and improve highly available, resilient, and secure systems
  • Define and track SLOs, SLIs, and SLAs
  • Implement and maintain observability (monitoring, logging, alerting)
  • Automate operations (CI/CD, infrastructure as code, self‑remediation)
  • Handle incidents (blameless post‑mortems, root cause analysis)
  • Collaborate with development teams to build reliability in earlier in the development cycle (shift‑left)
  • Participate in architectural decisions and technical reviews
  • Optimize cost, performance, and system capacity
  • Facilitate Scrum ceremonies (Sprint Planning, Daily, Review, Retrospective)
  • Support the team in the adoption of Agile and DevOps principles
  • Remove obstacles and protect the team from external interruptions
  • Foster collaboration between teams (Dev, Ops, Security, Product)
  • Work with the Product Owner on the backlog (prioritization, quality of user stories)
  • Measure and improve team performance (velocity, flow, quality)
  • Encourage a culture of continuous improvement and collective responsibility
  • Act as an Agile servant leader and coach

Requirements

  • Hold Scrum certifications (CSM, PSM, SAFe), as well as an AWS certification (an asset)
  • Have solid experience with Kubernetes and Docker
  • Have CI/CD experience (GitHub Actions, DevOps, etc.)
  • Have proven experience as a Scrum Master or similar role
  • Demonstrate experience in high‑criticality environments (an asset)
  • Have experience with observability tools (Splunk, Datadog, etc.)
  • Have a solid understanding of Linux systems, networks and security
  • Have an excellent understanding of cloud environments (AWS)
  • Have good scripting skills (Python, Bash, Go, etc.)

Skills

AWS, Bash, CI/CD, Docker, DevOps, Go, GitHub Actions, Kubernetes, Linux, Python, SAFe, Scrum, Splunk
