Senior Technology Site Reliability Engineer

Cooley LLP

US · On-site · Full-time · Senior · $140k – $205k/yr · Posted 3mo ago

About the role

Cooley is seeking a Senior Site Reliability Engineer to join the Infrastructure & Development Operations team.

Position Summary

The Senior Technology Site Reliability Engineer ("SRE") is responsible for ensuring the reliability, scalability, and performance of the firm's critical infrastructure and applications. The SRE blends software engineering with systems engineering to build and maintain automated, resilient, and observable systems that support high availability and operational excellence. In addition to being technically advanced, the SRE will have a high degree of emotional intelligence and the ability to work as part of a team toward complex, layered objectives.

Responsibilities

Monitor and maintain production systems to ensure high availability and performance
Implement and manage service-level indicators (SLIs), objectives (SLOs), agreements (SLAs), and error budgets
Participate in on-call rotations and incident response, including root cause analysis and postmortems
Develop and maintain infrastructure as code (IaC) using Terraform
Automate deployment, scaling, and recovery processes to reduce manual intervention
Partner with DevOps to build and maintain CI/CD pipelines to support safe and efficient software delivery
Implement observability solutions using metrics, logs, traces, and alerting systems (Prometheus, Grafana, Datadog, etc.)
Proactively identify and resolve system bottlenecks and reliability risks
Work closely with Infrastructure, DevOps, Development, and Security teams to embed reliability into the development lifecycle
Contribute to a culture of blameless postmortems and continuous improvement
Document procedures and share knowledge across teams
All other duties as assigned or required

Skills and Experience

Required

After orientation at Cooley LLP, exhibit proficiency in the Microsoft Office suite, iManage and other firm applications
Ability to work extended and/or weekend hours, as required
Ability to travel, as required
6+ years of directly applicable experience (e.g., site reliability engineering or a related field)
Proficiency in Terraform and programming languages such as Python, Go, or Java
Deep expertise in cloud platforms, particularly AWS, and container orchestration
Strong background in distributed systems, performance tuning, and automation
Hands‑on experience with configuration management tools such as Puppet, Chef, or Salt

Preferred

Bachelor's degree in Computer Science, Information Technology, Engineering, or a related discipline
Experience working with advanced ETL data workflows including technologies such as AWS EMR, Azure Synapse, Azure Data Factory, or Apache Hive/Spark/Airflow
Experience with IaC deployment of AKS/EKS/GKE architecture
Experience with enterprise data lake environments using technologies such as Databricks or Snowflake

Competencies

Expert analytical/quantitative, problem‑solving, and deductive reasoning skills, with experience performing advanced troubleshooting and root cause analysis of complex technical issues
Excellent organizational, planning, and time management skills and ability to work independently and in a team environment to manage competing priorities and meet deadlines
Advanced verbal and written communication skills with the ability to present findings, conclusions, alternatives, and information clearly and concisely
Experience working with all levels of business professionals, management, stakeholders, and vendors with the ability to build effective relationships through trust and diplomacy

Compensation & Benefits

Expected annual pay range for this full‑time position: $140,000 – $205,000 (final offer dependent on geographic location, applicable experience, and skillset)
Competitive compensation and excellent benefits package
Full range of elective benefits including medical, health savings account (with applicable medical plan), dental, vision, health and/or dependent care flexible spending accounts, pre‑tax commuter benefits, life insurance, AD&D, long‑term care coverage, backup care for children and/or adults, and other parental support benefits
Firm‑paid life insurance, AD&D, LTD, and short‑term medical benefits
21 days of Paid Time Off (PTO) and 10 paid holidays each year
Generous parental leave and fertility benefits
Detailed benefit orientation for new employees

Equal Opportunity Employer

Cooley offers a competitive compensation and excellent benefits package and is committed to fair and equitable employment practices. EOE.

Skills

AWS, AWS EMR, AD&D, Apache Airflow, Apache Spark, Azure Data Factory, Azure Synapse, Chef, CI/CD, Databricks, Docker, ETL, Go, Grafana, GKE, Hive, IaC, iManage, Java, Kubernetes, LTD, Microsoft Office, Prometheus, Puppet, Python, Salt, Site Reliability Engineering, Snowflake, Terraform
