SME SRE Observability

Info Way Solutions

US · On-site · Full-time · Senior · 3mo ago


About the role

Job Title

SME - SRE Observability Engineer

Location

Minnesota (Onsite - 4 to 5 days/week)

Job Summary

We are seeking an experienced Subject Matter Expert (SME) in Site Reliability Engineering (SRE) with a strong focus on Observability. The ideal candidate will be responsible for designing, implementing, and optimizing observability frameworks to ensure high system reliability, performance, and scalability in a production environment.

Key Responsibilities

Lead the design and implementation of observability solutions, including metrics, logging, and tracing.
Act as an SME for SRE best practices, ensuring system reliability, availability, and performance.
Develop and maintain dashboards, alerts, and monitoring strategies.
Collaborate with development, DevOps, and infrastructure teams to improve system visibility.
Perform root cause analysis (RCA) and drive incident resolution.
Optimize system performance and reliability through proactive monitoring.
Implement automation to improve operational efficiency and reduce manual intervention.
Define and track SLIs, SLOs, and SLAs.

Required Skills & Qualifications

Strong experience in Site Reliability Engineering (SRE) concepts and practices.
Deep expertise in Observability tools (e.g., Prometheus, Grafana, ELK Stack, Datadog, Splunk, or similar).
Experience with cloud platforms (AWS, Azure, or GCP).
Proficiency in scripting/programming (Python, Bash, or similar).
Hands-on experience with monitoring, alerting, and logging frameworks.
Strong troubleshooting and performance tuning skills.
Experience with CI/CD pipelines and automation tools.

Preferred Qualifications

Experience working with high-availability distributed systems.
Knowledge of containerization and orchestration tools (Docker, Kubernetes).
Prior experience as an SRE SME or Lead.

Skills

AWS, Azure, Bash, Datadog, Docker, ELK Stack, GCP, Grafana, Kubernetes, Prometheus, Python, Splunk
