Site Reliability Engineer (SRE)
New Era Technology
About the role
**Position Type: Contract**
**Duration: 12 months**
**Shift Timing: 12:00 PM to 8:00 PM IST**
**PRIMARY DUTIES:**
We are seeking a proactive and detail-oriented Site Reliability Engineer (SRE) with 3+ years of experience to ensure high availability, reliability, and performance of production systems. This role focuses on automation, observability, incident management, and cross-team coordination to drive operational excellence.
**Key Responsibilities**
• Maintain reliable, scalable, and secure production environments.
• Implement and manage monitoring, alerting, and logging solutions.
• Contribute to defining and tracking SLIs/SLOs and support error budget practices.
• Automate operational tasks to improve efficiency and reduce manual effort.
• Perform troubleshooting and Root Cause Analysis (RCA) for production incidents.
• Optimize system performance, availability, and capacity.
• Maintain runbooks, SOPs, and incident documentation in Confluence.
• Adhere to change management, deployment governance, and disaster recovery standards.
• Support incident response for critical production services.

**Collaboration & Tools**
• Coordinate with external vendors and internal cross-functional teams.
• Work closely with Engineering, Product Owners, and Operations teams.
• Manage incidents and changes using ServiceNow & JIRA.
• Collaborate through Slack and structured communication channels.

**Technical Skills**

**Systems & Cloud**
• Strong knowledge of Windows and Linux/Unix systems.
• Solid understanding of networking fundamentals (DNS, TCP/IP, Load Balancing, Firewalls).
• Experience with at least one cloud platform (AWS, Azure, or GCP).

**Automation & CI/CD**
• Proficiency in one scripting/programming language (Python, Go, Bash, PowerShell, or Java).
• Understanding of CI/CD pipelines and automation practices.

**Containers & Observability**
• Hands-on experience with Docker and Kubernetes.
• Experience with monitoring tools such as Grafana or Power BI.
• Ability to analyze logs, metrics, and traces for troubleshooting.

**ITSM & Documentation**
• Experience with ServiceNow & JIRA (incident/change/problem workflows).
• Working knowledge of Confluence for technical documentation and knowledge management.

**Additional Experience (Preferred)**
• Background in DevOps, Cloud Engineering, or Platform Engineering.
• Understanding of security best practices and compliance standards.
• Familiarity with AI-assisted engineering tools (Claude Code, Jellyfish, GitHub Copilot).
• Exposure to large-scale or production-grade systems.