Site Reliability Engineer

Equifax

Pimpri-Chinchwad · On-site · Full-time · Senior · Posted 2d ago

About the role

Role Overview

As a Site Reliability Engineer (SRE) at Equifax, you will be responsible for combining software and systems engineering to build and maintain large-scale, fault-tolerant systems. Your primary goal will be to ensure that internal and external services meet or exceed reliability and performance expectations while upholding Equifax engineering principles. You will use a variety of tools and approaches to solve operational problems and play a crucial role in maintaining overall system operation. Your contribution will be instrumental in preventing potential outages and ensuring system uptime across cloud-native and hybrid architectures.

Key Responsibilities

Manage system uptime for cloud-native (AWS, GCP) and hybrid architectures.
Develop infrastructure as code (IaC) patterns that meet security and engineering standards, using technologies such as Terraform, cloud CLI scripting, and cloud SDK programming.
Design and implement CI/CD pipelines for application and cloud architecture patterns utilizing tools like Jenkins and cloud-native toolchains.
Create automated tooling for deploying service requests and build comprehensive runbooks for managing, detecting, remediating, and restoring services.
Troubleshoot complex distributed architecture service maps and be on-call for high-severity incidents, improving runbooks to reduce mean time to resolve (MTTR).
Lead blameless postmortems on availability issues and drive actions to prevent recurrences.

Qualifications Required

Bachelor's degree in Computer Science or a related technical field involving coding, or equivalent job experience.
5-7 years of experience in software engineering, systems administration, database administration, and networking.
2+ years of experience in developing and/or administering software in public cloud environments.
Proficiency in programming languages like Python, Bash, Java, Go, JavaScript, and/or Node.js.
Cross-functional knowledge in systems, storage, networking, security, and databases.
Strong system administration skills including automation and orchestration using tools like Terraform, Chef, Ansible, and containers (Docker, Kubernetes).
Experience with continuous integration and continuous delivery practices.
Cloud Certification is strongly preferred.

Additional Company Details

The SRE culture at Equifax is characterized by diversity, intellectual curiosity, problem-solving, and openness. The company values collaboration, innovation, and risk-taking in a blame-free environment. Equifax encourages self-direction and meaningful project involvement while providing necessary support and mentorship for continuous learning and growth.

Skills

Ansible, AWS, Bash, Chef, CI/CD, Docker, GCP, Go, Infrastructure as Code, Java, JavaScript, Jenkins, Kubernetes, Node.js, Python, Terraform
