Senior Site Reliability Engineer

TechInsights Inc.

Remote · Canada · Full-time · Senior · Today

About the role

TechInsights is hiring a Senior Site Reliability Engineer with a focus on AI operations. The role is remote within Canada and centers on shaping the reliability of AI infrastructure.

In this senior role, you'll drive critical reliability initiatives and the adoption of best practices across AI operations. Working with development teams, you'll oversee SLOs and build reliability frameworks that keep deployments smooth and operations efficient.

Key Responsibilities

  • Manage SLOs, error budgets, and incident response initiatives
  • Design architecture strategies for AI agent reliability
  • Partner with engineering for compute provisioning and model serving
  • Own CI/CD pipeline strategies and drive adoption of SRE practices
  • Operate and extend observability across service health and AI workloads

Requirements

  • 6-8 years of experience in SRE or DevOps
  • Bachelor’s degree in a relevant field
  • Strong understanding of AWS architectures
  • Expertise in Terraform and CI/CD tools
  • Solid background in monitoring tools like Datadog

Lead the future of AI reliability at TechInsights through your technical expertise.

Skills

AWS · CI/CD · Datadog · DevOps · SRE · Terraform
