
Site Reliability Engineer

General Dynamics - IT

Home · On-site Full-time 3d ago

About the role

GDIT is seeking a Site Reliability Engineer (SRE) to help ensure the resilience, performance, and reliability of mission‑critical Defense systems. In this role, you will blend software engineering, automation, and operations expertise to build scalable platforms, reduce toil, and enable high‑velocity delivery.

How You’ll Make an Impact

Design, build, and maintain highly available, scalable systems across cloud and on‑prem environments.
Develop automation solutions that improve observability, speed recovery, and eliminate manual operational work.
Implement monitoring, alerting, and performance tuning strategies that ensure system health.
Collaborate with development and infrastructure teams to design reliable architectures and CI/CD pipelines.
Conduct root cause analysis and drive systemic improvements to prevent future incidents.
Champion SRE best practices such as SLIs/SLOs, error budgets, and automated incident response.
Provide input to proposal operations in your area of subject matter expertise, collaborating on solution elements and writing narratives that describe the technical solution designed for a specific opportunity.

What You’ll Need to Succeed

Required

Work Experience: 15+ years in system reliability, DevSecOps, cloud operations, or infrastructure engineering.
Education: Bachelor's degree with 15 years of experience, or an additional 4 years of work experience in lieu of a degree.
Strong scripting and automation skills (Python, Bash, PowerShell, etc.).
Hands‑on experience with monitoring tools (Prometheus, Grafana, Splunk, ELK, Datadog, etc.).
Familiarity with Kubernetes, container orchestration, and modern CI/CD pipelines.
Understanding of networking, Linux system internals, and distributed systems.
Ability to troubleshoot complex technical issues across the stack.
US citizenship required.
Candidate must possess an active Secret clearance to start and have the ability to attain Top Secret/SCI.

Preferred

Experience supporting DoD or other federal programs.
Certifications such as Kubernetes (CKA/CKAD), AWS/Azure, or ITIL.
Experience implementing SRE frameworks at scale.

Location & Travel

Location: Remote
Travel: 25-50% expected
