Senior Site Reliability Engineer
TechInsights Inc.
About the role
TechInsights seeks a Senior Site Reliability Engineer to strengthen the reliability of its AI operations, working from anywhere in Canada. You will oversee reliability strategy, manage error budgets, and collaborate closely with engineering teams, playing a central part in shaping the technical architecture and reliability practices at TechInsights. The role covers end-to-end reliability initiatives, including defining service-level objectives (SLOs) and leading incident management. Through collaboration and mentorship, you will raise technical standards and build the team's capabilities.
Key Responsibilities
- Develop SLOs and manage production service reliability metrics
- Architect solutions for AI agent failure containment
- Mentor junior engineers and enhance team capabilities
- Drive continuous improvement in operational processes
- Utilize Datadog for service health monitoring and automation
Requirements
- 6-8 years of experience in site reliability engineering
- Bachelor's degree in Computer Science or a related field
- Proficiency with AWS services and multi-region patterns
- Strong skills in Terraform and operational tooling
- Experience managing CI/CD pipelines
Transform site reliability for AI operations at TechInsights and drive impactful changes.