Site Reliability Engineer

TextNow

Remote · Canada · Full-time · Senior · Posted 5d ago

About the role

Location

Mannville

Compensation

This range is provided by TextNow. Your actual pay will be based on your skills and experience; talk with your recruiter to learn more.

Base pay range: CA$/yr - CA$/yr

About TextNow

We believe communication belongs to everyone. We exist to democratize phone service. TextNow is evolving the way the world connects because we're made up of people with curious minds who bring an optimistic yet critical lens to the work we do. We're the largest provider of free phone service in the nation, and we're just getting started.

Join us in our mission to break down barriers to communication and free the flow of conversation for people everywhere.

Role: Senior Site Reliability Engineer

TextNow is looking for a motivated Senior Site Reliability Engineer to own infrastructure, monitoring, logging, CI/CD, reliability, and everything in between!

What you'll do

  • Ensure System Reliability: Design, build, and maintain scalable, resilient, and highly available systems to support TextNow’s infrastructure and services
  • Automation & Infrastructure as Code: Develop and maintain automation using Terraform, Ansible, and other tools to enable efficient deployment, scaling, and operations of cloud‑based systems (AWS preferred)
  • Incident Response & On‑Call Support: Participate in an on‑call rotation, troubleshoot issues, and drive incident resolution to minimize downtime and improve system performance. Conduct post‑mortems and implement corrective actions to enhance reliability
  • Performance Monitoring & Optimization: Implement and improve observability tools, logging, and monitoring solutions to identify and mitigate potential system issues proactively
  • Collaboration & Cross‑Team Engagement: Work closely with software engineers, DevOps, and product teams to align technical efforts with business objectives and improve system reliability from development to production
  • Continuous Improvement: Identify areas for improvement in architecture, automation, and operational practices. Contribute to the design and implementation of new SRE best practices

Qualifications & fit

  • 5+ years of experience in an operationally focused role (SRE, DevOps, or Infrastructure Engineering) with a deep understanding of reliability, scalability, and performance optimization
  • Hands‑on experience with AWS, GitHub, Terraform, Ansible, or similar tools to build and manage cloud infrastructure efficiently
  • Incident Management expertise: comfortable handling production incidents, analyzing root causes, and implementing long‑term fixes to prevent recurrence
  • Automation & Observability focused: passion for reducing toil through scripting and automation while ensuring robust observability using logging, metrics, and monitoring tools
  • Collaborative & impact‑driven: enjoys working cross‑functionally with engineers, product teams, and leadership to drive meaningful improvements to system reliability

More about TextNow

Our Values

  • Customer Obsessed (We strive to have a deep understanding of our customers)
  • Do Right By Our People (We treat each other with fairness, respect, and integrity)
  • Accept the Challenge (We adopt a Yes, We Can mindset to achieve ambitious goals)
  • Act Like an Owner (We treat this company like it's our own... because it is!)
  • Give a Damn! (We are deeply committed and passionate about our work and achieving results)

Benefits, Culture, & More

  • Strong work‑life blend
  • Flexible work arrangements (WFH, remote, or access to one of our office spaces)
  • Employee Stock Options
  • Unlimited vacation
  • Competitive pay and benefits
  • Parental leave
  • Benefits for both physical and mental well‑being (wellness credit and L&D credit)
  • We travel a few times a year for various team events, company‑wide off‑sites, and more

Diversity and Inclusion

TextNow's mission is built around inclusion and offering a service for EVERYONE. We believe that diversity of thought and inclusion of others promote a greater feeling of belonging and higher levels of engagement. We know that if we work together, we can do amazing things, and that our differences are what make our product and company great.

Skills

Ansible, AWS, CI/CD, Docker, Git, GitHub, Infrastructure as Code, Logging, Monitoring, Observability, SRE, Terraform
