
AWS SRE Engineer

Marks Sattin

Glasgow · On-site · Full-time · Senior · 2w ago

About the role

Overview

We’re hiring an experienced AWS SRE Engineer to lead observability for a cloud platform. The role focuses on building and maintaining actionable Grafana dashboards, defining and measuring reliability (SLIs/SLOs/SLAs), owning alerting strategy, and driving improvements to platform resilience. This is an opportunity to shape operational excellence and influence engineering decisions across the stack.

What you’ll do (key responsibilities)

  • Design, build and maintain Grafana dashboards that deliver actionable insights into performance, availability and capacity.
  • Implement and improve observability for AWS-hosted applications and infrastructure (metrics, logs, traces).
  • Define and track SLIs, SLOs and SLAs; manage error budgets and translate reliability targets into engineering priorities.
  • Monitor services using the golden signals (latency, traffic, errors, saturation) and operate an effective, noise-aware alerting strategy.
  • Support incident response, run root cause analysis (RCA) processes and drive continuous reliability improvements.
  • Embed observability into CI/CD and cloud operations; collaborate with platform, engineering and ops teams to improve operational efficiency.

Must-have skills and experience

  • 6+ years in SRE, Cloud Reliability or Cloud Operations roles.
  • Strong, hands-on AWS experience.
  • Proven expertise building Grafana dashboards and working in observability/monitoring stacks.
  • Solid understanding of SRE fundamentals (SLA, SLO, SLI, error budgets, golden signals).
  • Track record of troubleshooting production systems and improving platform reliability.
  • Strong communicator and team collaborator.

Nice-to-have

  • Experience with Snowflake or Databricks.
  • Familiarity with IaC, automation and cloud-native operational tooling.

Skills

AWS · Grafana · SRE
