Site Reliability Engineer

Equifax

Pimpri-Chinchwad · On-site · Full-time · Senior · 2d ago

About the role

Role Overview

As a Site Reliability Engineer (SRE) at Equifax, you will be responsible for combining software and systems engineering to build and maintain large-scale, fault-tolerant systems. Your primary goal will be to ensure that internal and external services meet or exceed reliability and performance expectations while upholding Equifax engineering principles. You will use a variety of tools and approaches to solve operational problems and play a crucial role in maintaining overall system operation. Your contribution will be instrumental in preventing potential outages and ensuring system uptime across cloud-native and hybrid architectures.

Key Responsibilities

  • Manage system uptime for cloud-native (AWS, GCP) and hybrid architectures.
  • Develop infrastructure as code (IaC) patterns that meet security and engineering standards, using technologies such as Terraform, cloud CLI scripting, and cloud SDK programming.
  • Design and implement CI/CD pipelines for application and cloud architecture patterns utilizing tools like Jenkins and cloud-native toolchains.
  • Create automated tooling for deploying service requests and build comprehensive runbooks for managing, detecting, remediating, and restoring services.
  • Troubleshoot complex distributed architecture service maps, be on call for high-severity incidents, and improve runbooks to reduce mean time to resolve (MTTR).
  • Lead blameless postmortems on availability issues and drive actions to prevent recurrences.

Qualifications Required

  • Bachelor's degree in Computer Science or a related technical field involving coding, or equivalent job experience.
  • 5-7 years of experience in software engineering, systems administration, database administration, and networking.
  • 2+ years of experience in developing and/or administering software in public cloud environments.
  • Proficiency in programming languages and runtimes such as Python, Bash, Java, Go, JavaScript, and/or Node.js.
  • Cross-functional knowledge in systems, storage, networking, security, and databases.
  • Strong system administration skills including automation and orchestration using tools like Terraform, Chef, Ansible, and containers (Docker, Kubernetes).
  • Experience with continuous integration and continuous delivery practices.
  • A cloud certification is strongly preferred.

Additional Company Details

The SRE culture at Equifax is characterized by diversity, intellectual curiosity, problem-solving, and openness. The company values collaboration, innovation, and risk-taking in a blame-free environment. Equifax encourages self-direction and meaningful project involvement while providing necessary support and mentorship for continuous learning and growth.

Skills

Ansible, AWS, Bash, Chef, CI/CD, Docker, GCP, Go, Infrastructure as Code, Java, JavaScript, Jenkins, Kubernetes, Node.js, Python, Terraform
