Lead Cloud Site Reliability Engineer

Bank of Scotland

Halifax Regional Municipality · Hybrid · Full-time · Lead · £90k – £106k/yr · Posted 2w ago

About the role

End Date

Tuesday 31 March 2026

Salary Range

£92,701 - £109,060

Flexible Working Options

We support flexible working. Options available: Hybrid Working, Job Share

Job Description

Lead Site Reliability Engineer - Public Cloud Platform
Location: Halifax, Leeds, Bristol or Manchester
Salary: £90,440 - £106,400
Working Pattern: Hybrid (2 days in office per week)

About this opportunity

At Lloyds Banking Group, we're all driven by our purpose, to Help Britain Prosper. It's why we exist - it's our reason to get out of bed in the morning. The choices we make, our success and our future really matter.

But the world is changing, fast. And we're changing too. It's never been a more exciting time to join us as we transform our business to shape finance as a force for good. We're modernising with cloud, a platform that is quick, secure and resilient for customers and easy, modern and green for developers.

We're looking for a Lead Site Reliability Engineer (SRE) to help us strengthen reliability, observability and operational excellence across our Azure and GCP platforms. This is an exciting opportunity to lead a team of highly skilled SREs, influence engineering standards across the Group, and drive improvements that make a tangible difference to both customers and engineers.

What you'll do

As a Lead SRE, you will:

  • Lead a team of SREs (up to ~15) and create a culture of continuous improvement, learning, and engineering excellence.
  • Work closely with application teams during their migrations to the cloud.
  • Work closely with Product Owners and Engineering Leads to balance new feature delivery with reliability, performance and system health.
  • Use data, observability tooling and SRE principles to detect issues early, improve system performance, and reduce operational toil.
  • Lead and mature incident and problem management practices, ensuring strong root-cause analysis, learning, and improvement of MTTF/MTTR.
  • Champion error budgets, SLOs, and reliability-first thinking across your aligned Cloud Labs.
  • Influence platform direction and engineering standards, helping shape how we build resilient cloud services at scale.

What You'll Bring

Core Technical & Engineering Skills

  • Strong cloud engineering background - ideally across GCP and Azure - with experience designing or operating large-scale, resilient cloud platforms.
  • Deep understanding of observability tooling (metrics, logs, traces) and how to drive reliability improvements using data.
  • Hands-on experience of modern SRE practices:
    • SLOs / SLIs
    • Error budgets
    • Reducing toil through automation
    • Production readiness and postmortem best practice

Leadership & Collaboration

  • Experience leading engineering teams and fostering an inclusive, high-performing culture.
  • Ability to navigate complex stakeholder groups and communicate technical topics in a clear, accessible way.

Mindset and Behaviours

  • Technology-agnostic, adaptable thinker who selects the best tool or approach for the job.
  • Curiosity and a commitment to continuous learning and improvement - both for yourself and your team.
  • Passion for engineering excellence, platform health, and proactive reliability.

About You

You're someone who:

  • Is passionate about building resilient, observable, customer-focused platforms.
  • Has a strong understanding of GitHub pipelines and Terraform modules.
  • Enjoys coaching others, sharing knowledge and shaping engineering culture.
  • Looks for opportunities to remove toil and introduce automation.
  • Thrives in collaborative, multi-functional environments.
  • Adopts new tools, technologies and modern engineering approaches.
  • Values diverse perspectives, psychological safety and inclusive ways of working.

What You'll Get in Return

You'll join a forward-thinking platform organisation that:

  • Is modernising at scale with Cloud, AI-enabled operations and real-time observability
  • Encourages innovation, autonomy and engineering craft
  • Invests in colleague development, learning pathways and progression
  • Champions diversity, equity and inclusion across everything we do

You'll help shape the future of cloud operations in one of the UK's largest financial institutions - and your work will have real impact on millions of customers.

We also offer a wide-ranging benefits package, which includes:

  • A generous pension contribution of up to 15%
  • An annual performance-related bonus
  • Share schemes including free shares
  • Benefits you can adapt to your lifestyle, such as discounted shopping
  • 30 days' holiday, with bank holidays on top
  • A range of wellbeing initiatives and generous parental leave policies

Inclusion and Diversity

We're committed to building an inclusive environment where everyone can be themselves and thrive. We value diversity of thought, background and experience, and we actively encourage applications from all communities. If you need reasonable adjustments during the recruitment process, please let us know.

At Lloyds Banking Group, we're driven by a clear purpose: to help Britain prosper. Across the Group, our colleagues are focused on making a difference to customers, businesses and communities. With us you'll have a key role to play in shaping the financial services of the future, whilst the scale and reach of our Group means you'll have many opportunities to learn, grow and develop.

We keep your data safe. We'll only ever ask you to provide confidential or sensitive information once you have been formally invited to an interview or have accepted a verbal offer to join us, which is when we run our background checks. We'll always explain what we need and why, with any request coming from a trusted Lloyds Banking Group person.

We're focused on creating a values-led culture and are committed to building a workforce which reflects the diversity of the customers and communities we serve. Together we're building a truly inclusive workplace where all of our colleagues have the opportunity to make a real difference.

Benefits

Pension contribution, performance-related bonus, share schemes, discounted shopping, holiday, wellbeing initiatives, parental leave policies

Skills

Azure, GCP, GitHub, Terraform
