Sr. Manager, Site Reliability Engineering

Open Text Corporation

Richmond Hill · On-site · Full-time · Lead · 1w ago

About the role

About

OpenText is a global leader in information management, where innovation, creativity, and collaboration are the key components of our corporate culture. As a member of our team, you will have the opportunity to partner with the most highly regarded companies in the world, tackle complex issues, and contribute to projects that shape the future of digital transformation.

AI-First. Future-Driven. Human-Centered.

At OpenText, AI is at the heart of everything we do—powering innovation, transforming work, and empowering digital knowledge workers. We are hiring talent AI can't replace to help us shape the future of information management. Join us.

Your Impact

You will manage a globally distributed team of Site Reliability Engineers focused on delivering automation‑first solutions for the distributed data services that power OpenText’s SaaS products. Your primary responsibility will be managing technologies including Elasticsearch, OpenSearch, Cassandra, Kafka, Redis, RabbitMQ, and Solr for multi‑tenant cloud applications. You will do this using Infrastructure‑as‑Code (Terraform), deployment automation tools such as Ansible, and CI/CD pipelines (GitLab), and you will explore AI‑driven approaches to optimize automation and operational efficiency.

In addition to platform ownership, you will champion automation practices and frameworks that reduce operational toil and improve reliability. You will have the opportunity to shape and scale automation practices that deliver value across diverse operational areas.

We are looking for a strategic, automation‑focused leader who excels at team building, operational execution, and cross‑functional collaboration to deliver highly reliable and scalable services for OpenText’s enterprise‑scale cloud platforms.

What The Role Offers

  • Lead automation‑first delivery for distributed data services (Elasticsearch, Cassandra, Kafka, Redis, RabbitMQ, Solr) supporting multi‑tenant cloud applications.
  • Drive reusable automation solutions using methods such as Infrastructure‑as‑Code (Terraform), deployment automation tools (Ansible), and CI/CD pipelines (GitLab).
  • Champion automation standards and best practices, enabling consistency, scalability, and security.
  • Explore AI‑driven approaches to optimize automation, enhance reliability, and predict operational issues.
  • Evaluate and adopt new data service technologies based on business needs, scalability, and operational fit.
  • Collaborate cross‑functionally with product management, application engineering, and operations teams to identify high‑value automation opportunities.
  • Manage the lifecycle of distributed data services, including provisioning, scaling, patching, and failover, with a strong emphasis on automation.
  • Manage the team's on‑call rotation, ensuring 24x7x365 support coverage and acting as an escalation point for incident resolution when necessary.
  • Manage third‑party providers, including consultants for expertise and escalation support, and SaaS vendors whose services we operate.
  • Mentor and develop engineers, fostering a culture of innovation and continuous improvement.
  • Measure and optimize service performance, reliability, and cost efficiency through telemetry and automation.
  • Ensure compliance and security are embedded in all automation workflows.
  • Support incident resolution and problem management, leveraging automation for faster recovery and root cause analysis.

What You Need To Succeed

  • BS/MS degree in Computer Engineering, Computer Science, or related field.
  • 8+ years of experience in large‑scale cloud or distributed systems environments.
  • 5+ years of people management experience, building and leading high‑performing technical teams.
  • Proven experience designing and delivering automation solutions for infrastructure and operations, including Infrastructure‑as‑Code and CI/CD practices.
  • Proven experience managing distributed data services such as Elasticsearch, OpenSearch, Cassandra, Kafka, Redis, RabbitMQ, or Solr.
  • Strong proficiency in automation methods including Infrastructure‑as‑Code (Terraform), deployment management tools (Ansible), CI/CD pipelines (GitLab preferred), and other automation frameworks.
  • Experience with private and public cloud platforms (AWS, GCP, Azure, VMware) and container orchestration (Kubernetes, Anthos).
  • Familiarity with cloud‑native data services (e.g., AWS ElastiCache, Google Pub/Sub, Memorystore, Confluent Cloud) and managed offerings.
  • Solid understanding of Linux systems and networking fundamentals.
  • Familiarity with AIOps concepts for operational optimization and predictive analytics.
  • Ability to define automation standards and drive adoption across multiple teams.
  • Experience managing vendor relationships, including consultants and SaaS providers.
  • Strong collaboration skills with cross‑functional teams (product, engineering, operations).
  • Excellent communication and stakeholder management skills.

One Last Thing

OpenText is more than just a corporation; it's a global community where trust is foundational, the bar is raised, and outcomes are owned.

Join us on our mission to drive positive change through privacy, technology, and collaboration. At OpenText, we don't just have a culture; we have character. Choose us because you want to be part of a company that embraces innovation and empowers its employees to make a difference.

OpenText's commitment to diversity and inclusion surpasses legal requirements, evident in our Equal Employment Opportunity Statement of Policy which promotes a respectful and empowering environment for employees of all backgrounds, culture, national origin, race, color, gender, gender identification, sexual orientation, family status, age, veteran status, disability, religion, or other basis protected by applicable laws.

If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please submit a ticket at Ask HR. Our proactive approach fosters collaboration, innovation, and personal growth, enriching OpenText's vibrant workplace.

Compensation

At OpenText, we offer a thoughtfully designed benefits package that supports your physical, emotional, and financial wellbeing. As you move through the hiring process, we’re happy to provide more details about our compensation programs, including variable and commission compensation opportunities for eligible roles, vacation entitlement, and paid time off.

Salary Range

Depending on the candidate’s education, experience, skills, geographical location, and alignment with internal equity and external market, actual salary may vary and be higher or lower than the range posted.

AI Usage Disclosure

As part of our commitment to transparency, we use artificial intelligence (AI) tools to assist in various stages of our recruitment process, including resume screening, candidate matching, interview scheduling, and communications. These tools are designed to improve efficiency, reduce bias, and enhance candidate experience. All decisions regarding hiring are made by qualified human professionals, and we continuously monitor our AI systems to ensure fairness and compliance with applicable regulations.

Skills

Ansible, AWS, Azure, Cassandra, Confluent Cloud, Docker, Elasticsearch, GCP, GitLab, Google Pub/Sub, Kafka, Kubernetes, Linux, Memorystore, OpenSearch, RabbitMQ, Redis, Solr, Terraform, VMware
