IT
Engineer Sr Lead, Site Reliability (MS SQL/Azure App Services Focused)
Inficare Technologies
Remote · US · Full-time · Senior · Posted yesterday
About the role
Location: Atlanta, GA (the role is remote, but the client wants a candidate located in Atlanta, GA)
Position Overview
The Senior Lead Site Reliability Engineer ensures the reliability, performance, and resilience of mission-critical systems with a strong focus on MS SQL and Azure App Services platforms. This role blends software engineering, cloud operations, and reliability engineering to optimize SQL workloads, reduce operational risk, and improve service availability across distributed environments.
Responsibilities
- Lead SRE practices for systems leveraging MS SQL and Azure App Services at enterprise scale, defining standards for reliability, performance optimization, and operational automation.
- Architect and implement highly available, fault-tolerant infrastructure supporting MS SQL and Azure App Services workloads.
- Develop observability frameworks (monitoring, alerting, logging) tailored to MS SQL and Azure App Services performance, cost monitoring, query optimization, and system health.
- Partner with data engineering, development, and architecture teams to embed reliability into SQL schema design, workload management, job orchestration, and CI/CD deployments.
- Drive incident management for SQL- and Azure-related issues, performing root-cause analysis and implementing durable, systemic remediation.
- Reduce operational toil through engineering automation, improving stability and responsiveness of SQL workloads and dependent services.
- Guide teams in adopting reliability, automation, and SQL/Azure operational best practices.
Qualifications
- Extensive SRE or production engineering experience supporting large-scale, cloud-based systems.
- Expertise in automation, observability tooling, cloud engineering (Azure preferred), and distributed systems.
- Experience leading reliability initiatives and supporting complex data and application ecosystems.
Skills
Azure App Services, Azure, CI/CD, MS SQL, SQL