Sr. Principal Infrastructure Services

Northern Trust

India · On-site · Full-time · Senior · Posted today

About the role

As a Senior Principal Site Reliability Engineer at Northern Trust, you will focus on developing observability and automation to ensure the reliability and performance of the company's systems and services. Your expertise in software engineering and system operations will drive continuous improvements in platform reliability. This role will involve working with cross‑functional teams to enhance the efficiency of services and bring complete observability across all technologies.

Key Responsibilities

Lead the design and evolution of highly reliable, scalable, and performant distributed systems
Partner with engineering and architecture teams to influence system design decisions
Drive an automation‑first approach by designing and developing tools and platforms
Participate in and lead incident response for production systems
Architect and implement end‑to‑end observability across systems
Identify reliability gaps through data analysis and drive improvement initiatives
Create and maintain clear documentation and knowledge sharing practices
Collaborate with product, development, platform, security, and operations teams
Manage and prioritize multiple reliability‑focused initiatives

Qualifications Required

Bachelor's degree in Computer Science, Engineering, or related discipline
15+ years of progressive experience in systems engineering with a strong emphasis on site reliability
7+ years of experience in a technical leadership role
Strong proficiency in one or more modern programming languages
Hands‑on experience with containerization and container orchestration technologies
Proven ability to design and implement observability solutions
Deep understanding of distributed systems, networking fundamentals, and modern software architectures
Exceptional problem‑solving skills and stakeholder orientation
Prior experience designing and delivering Infrastructure as Code (IaC)
Demonstrated success in mentoring and developing technical teams
Hands‑on expertise in implementing automated remediation and corrective actions

Additional Overview

This role at Northern Trust offers you the opportunity to play a pivotal part in ensuring the reliability and performance of the company's systems and services. Your contributions will help drive continuous improvements in platform reliability and efficiency, making a meaningful impact on the organization's success.

Skills

Infrastructure as Code (IaC)
