Site Reliability Engineer
Ottawa, ON, CA + 2 more
Qlik
About Qlik
A Gartner® Magic Quadrant™ Leader for 15 years in a row, Qlik transforms complex data landscapes into actionable insights, driving strategic business outcomes. Serving over 40,000 global customers, our portfolio leverages pervasive data quality and advanced AI/ML capabilities that lead to better decisions, faster.
We excel in integration and governance solutions that work with diverse data sources, and our real‑time analytics uncover hidden patterns, empowering teams to address complex challenges and seize new opportunities.
We were also recognized as one of the National Capital Region's Top Employers in Canada for 2025.
Find out more about “Life at Qlik” on social media, and see our values and all other opportunities to join us on our careers page.
Culture & Benefits
- Genuine career progression pathways and mentoring programs
- Culture of innovation, technology, collaboration, and openness
- Flexible, diverse, and international work environment
- Giving back is a huge part of our culture – an extra “change the world” day, a personal development day, and encouragement to participate in community initiatives
Role Overview – Site Reliability Engineer
As a Site Reliability Engineer at Qlik, you’ll sit at the heart of our cloud ecosystem, helping power the reliability, security, and scalability of Qlik and Talend Cloud services used around the world.
You’ll work on systems operating at serious scale – supporting millions of transactions across a global cloud environment – while shaping how reliability engineering is done across the business.
You won’t just “keep the lights on.” You’ll design, improve, automate, and elevate how modern cloud platforms perform.
Responsibilities
- Solve real‑scale challenges – work on reliability and performance across a global cloud platform handling millions of transactions
- Engineer, not just operate – build tooling, automation, alerts, and scalable infrastructure patterns that prevent problems before they happen
- Collaborate with highly skilled teams – partner with Global SRE, Architecture, Platform, and Domain Engineering teams to influence how infrastructure is designed from the ground up
- Work with modern cloud‑native technologies – Kubernetes, IaC, observability tooling, autoscaling, secret management, CI/CD; hands‑on with today’s most relevant technologies
- Shape best practices – help define and champion cloud optimization and reliability standards across the organization
- Grow technical influence – act as a go‑to resource for reliability, incident management, cloud engineering, and production operations
- Continuously evolve – stay close to emerging tools and practices, contributing to ongoing improvements in our cloud environment
Impact
- Increase reliability and availability by implementing resilient infrastructure patterns and performance optimizations
- Reduce incidents and recovery time through better observability, automation, and proactive engineering
- Strengthen scalability by designing infrastructure that adapts seamlessly to growth
- Improve cloud efficiency by driving optimization best practices across AWS and Azure environments
- Resolve complex system challenges across infrastructure, networking, applications, and distributed systems
- On‑Call Support: participate in on‑call duties to maintain availability and performance of our cloud infrastructure, providing regular updates on project status and activities (first‑line incident response)
- Elevate engineering standards by mentoring peers and embedding reliability‑first thinking into development workflows
Requirements / Qualifications
- Cloud engineering skill across AWS and/or Azure, including hands‑on experience supporting production systems running on Kubernetes at scale
- Infrastructure as Code and microservices experience, using tools such as Terraform, Crossplane or Ansible, with a strong understanding of operating distributed systems in live environments
- Automation and engineering mindset, with proficiency in Python, Go or Bash, plus experience building and improving CI/CD pipelines and autoscaling strategies
- Observability and incident management depth, including Prometheus, Grafana, OpenTelemetry, distributed tracing, and SIEM tooling – with the ability to turn insights into reliability improvements
- Security and networking knowledge, including secret management (e.g., Vault, AWS SSM) and familiarity with infrastructure security and compliance best practices
- Cloud‑native tooling experience, including Helm (managing and creating charts) and exposure to modern database and ecosystem technologies such as MongoDB
- Strong analytical thinking, with the ability to troubleshoot complex issues across infrastructure, networking, and application layers
- Curiosity and collaboration at their core; a passion for learning, sharing ideas and insight, and comfort with the on‑call support rotation (experience here is also welcome)
Location
Ottawa, Canada, in a hybrid working model, or remote from select locations across Ontario.
Compensation & Benefits
- Anticipated base salary range: $100,000 CAD to $133,000 CAD (final compensation based on location, skills, education, experience, and business needs)
- Comprehensive benefits package (eligibility controlled by applicable Qlik plan documents and policies)
Application Process
- Qlik’s recruitment team uses AI‑enabled tools to help assess and evaluate candidates’ qualifications. Any hiring decision will involve a human review; you will not be subject to decisions based solely on automated means.
- If you need assistance applying for a role due to a disability, please submit your request via email to the address provided in the posting. All information will be treated according to Qlik’s privacy policy.
- Qlik may only respond to emails related to accommodation requests.
Legal & Equal Opportunity
- Qlik is not accepting unsolicited assistance from search firms for this employment opportunity. Please, no phone calls or emails. All resumes submitted by search firms to any employee at Qlik via email, the Internet, or any form without a valid written search agreement will be deemed the sole property of Qlik. No fee will be paid in the event the candidate is hired by Qlik as a result of the referral or through other means.