Manager of Site Reliability Engineering (SRE)
IBM
About the role
Introduction
Join our dynamic IBM CISO Platform team as a Manager of Site Reliability Engineering (SRE). If you are an innovative thinker with a passion for leadership and continuous improvement, this role is for you. You will lead a high-performing SRE team, ensure world-class performance and resilience of IBM's internal security platforms, and collaborate with cross-functional teams to meet compliance and performance goals.
Your role and responsibilities
In this position, you will act as a Software Developer: Generalist and will engage in the design, development, testing, and delivery of cutting-edge technology solutions. You will work in an Agile environment to understand stakeholder needs and contribute to innovative software development.
Your primary responsibilities will include:
- Develop Component-Level Solutions: Design, code, and test innovative software components, ensuring all solutions are unit tested and ready for integration.
- Contribute to CI/CD Pipeline: Enhance the automated CI/CD pipeline to ensure seamless code integration and delivery across various quality stages.
- Debug Customer-Reported Problems: Create code fixes for customer-reported issues, collaborating with stakeholders for efficient resolution.
- Deliver Offerings: Ensure high-quality offerings by utilizing cutting-edge and proven technologies that meet stakeholder expectations.
- Collaborate in Agile Environment: Work closely within an Agile framework to align solutions with business objectives and stakeholder requirements.
Required technical and professional expertise
- Proven experience in managing or leading engineering, SRE, DevOps, or operations teams.
- Oversee the implementation and automation of operational processes, including infrastructure, monitoring, incident response, and runbooks.
- Own end-to-end service reliability, including SLIs/SLOs, capacity planning, performance optimization, and operational health.
- Ensure compliance with IBM CISO and enterprise security standards, regulatory requirements, and risk policies.
- Effectively communicate strategy, risks, operational status, and metrics to leadership and stakeholders.
- Influence technology roadmaps and foster operational readiness for new internal solutions.
- Demonstrated ability to deliver reliable, highly available services.
- Deep understanding of security, compliance, and risk management frameworks.
- Successful track record in automating infrastructure, monitoring, and operational tasks.
- Lead, develop, and mentor Site Reliability Engineers, providing guidance on career development and performance management.
- Promote a high-performing engineering culture focused on accountability, innovation, and continuous improvement.
- Align team objectives with IBM CISO's strategic direction and broader Enterprise & Technology Services goals.
- Plan staffing, manage workloads, and ensure 24/7 service support coverage, including on-call readiness.
- Excellent communication skills with the ability to influence and align across teams.
- Balance support for current systems while leading modernization efforts.
- Experience with Release/Change Management processes.
- Ability to address critical issues outside of business hours.
Preferred technical and professional experience
- Experience with Kubernetes, OpenShift, or similar container orchestration platforms.
- Background in building or managing Cloud-native environments (AWS, Azure, GCP, IBM Cloud), Hybrid Cloud, and on-premises infrastructure.
- Familiarity with observability tools.
- Understanding of networking fundamentals and modern networking architectures.
- Knowledge of Infrastructure as Code (Terraform, Ansible, etc.).
- Exposure to Agile methodologies and tooling (Scrum, Kanban, Jira, etc.).
- Working knowledge of scripting/programming languages (e.g., Python).
- Professional Cloud and/or Security certifications (AWS, CISSP, etc.).
IBM is committed to fostering a diverse workplace and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also dedicated to complying with all fair employment practices regarding citizenship and immigration status.