Senior Site Reliability Engineer
TechInsights Inc.
Remote · Canada · Full-time · Senior · Posted today
About the role
Advance your career at TechInsights as a Senior Site Reliability Engineer with a focus on AI operations. Shape AI infrastructure reliability and lead innovative solutions remotely in Canada.
In this senior role, you'll drive critical reliability initiatives and implement best practices across AI operations. Working with development teams, you'll oversee SLOs and create reliability frameworks that keep deployments smooth and operations efficient.
Key Responsibilities
- Manage SLOs, error budgets, and incident response initiatives
- Design architecture strategies for AI agent reliability
- Partner with engineering for compute provisioning and model serving
- Own CI/CD pipeline strategies and drive adoption of SRE practices
- Operate and extend observability across service health and AI workloads
Requirements
- 6-8 years of experience in SRE or DevOps
- Bachelor’s degree in a relevant field
- Strong understanding of AWS architectures
- Expertise in Terraform and CI/CD tools
- Solid background in monitoring tools like Datadog
Lead the future of AI reliability at TechInsights through your technical expertise.
Skills
AWS, CI/CD, Datadog, DevOps, SRE, Terraform