Site Reliability Engineer III

CSC Holdings LLC

Plano · On-site · Full-time · Senior · 2mo ago

About the role

Job Summary

As a Site Reliability Engineer III, you will be a primary driver in the long-term management and stabilization of our Hybrid Cloud infrastructure. We maintain a permanent dual-hosting strategy, operating both Google Cloud Platform (GCP) and a mission-critical On-Premises Unix/Linux footprint. You will bridge the gap between physical hardware and modern cloud-native operations, applying software engineering principles to ensure our systems are scalable, secure, and predictable across all platforms.

The Mission: Hybrid Reliability & Stabilization

Your mission is to unify our GCP and On-Premises environments into a single, reliable platform. Your first 12 months will focus on Stabilization and Observability. You will lead the transition away from "toil" (manual, repetitive operations) toward high-leverage automation, aggressively addressing on-prem technical debt while implementing modern SRE practices across our global data centers and cloud projects.

Responsibilities

Hybrid Platform Standardization: Audit, harden, and standardize Unix (Solaris/AIX) and Linux (RHEL/Ubuntu) environments across both GCP Compute Engine and physical bare-metal servers.
Infrastructure Stewardship (DC Support): Serve as the engineering lead for our Eastern U.S. data centers; ensure hardware health, power redundancy, and physical security standards are enforced through code and automated checks.
Storage Engineering (Specialization): Architect and manage enterprise-grade SAN/NAS environments alongside GCP Cloud Storage/Persistent Disk. Optimize for low latency and high IOPS while ensuring all data-at-rest complies with our Annual Encryption Strategy.
Automation of Toil: Design and maintain robust automation pipelines (Ansible, Terraform, Python) to ensure configuration parity and eliminate drift between cloud and on-premises environments.
Vulnerability Management: Transition the fleet from a "vulnerable" state to a "reliable" one by establishing a sustainable, automated monthly patching cadence.
Unified Observability: Implement and scale a "single pane of glass" monitoring stack (Prometheus, Grafana, Loki) to provide real-time health metrics for the entire hybrid estate.
Incident Response & Post-Mortems: Participate in a sustainable on-call rotation. Lead Blameless Post-Mortems for incidents involving cross-platform dependencies to ensure we "fix the system, not the person."

Qualifications

Technical Requirements (SRE3)

OS Internals: Deep proficiency in Linux (RHEL/Ubuntu) and Unix (Solaris/AIX) administration and kernel tuning.
Cloud Proficiency: Hands-on experience with GCP (IAM, VPC, Compute Engine) or equivalent public cloud providers.
Infrastructure as Code: Proven ability to manage complex environments using Terraform and Ansible.
Storage Protocols: Proficiency in Fibre Channel, iSCSI, and NFS. Experience with enterprise arrays (NetApp, Dell/EMC, or Pure Storage) is highly preferred.
Software Engineering: Strong scripting ability in Python or Go to build internal tools and automation.
Security: Strong understanding of CVE lifecycles and cryptographic standards (AES-256).

The Ideal Candidate

Bachelor’s degree in Telecommunications, Computer Engineering, or related discipline
6+ years of experience in IP networking and infrastructure support, with at least 4 years in reliability-focused roles

Skills

AIX, Ansible, AES-256, Compute Engine, CVE, Dell/EMC, Docker, Fibre Channel, GCP, Go, Grafana, Hybrid Cloud, IAM, Infrastructure as Code, IP networking, iSCSI, Linux, Loki, NetApp, NFS, On-Premises, OS Internals, Prometheus, Pure Storage, Python, RHEL, SAN, Solaris, Terraform, Unix, Ubuntu, VPC
