Reliability Engineer
HCLTech
Alticane · On-site · Full-time · Lead · 1mo ago
About the role
Lead operational excellence as an AWS-focused Reliability Engineer. Ensure service reliability through effective incident management and proactive maintenance across hosted services.
This role supports around 1000 AWS-hosted services with 24×7 coverage. You will drive MTTA/MTTR improvements, conduct preventive health checks, and analyze incident trends to identify automation opportunities. Close collaboration with Cloud Engineering and application teams is key to implementing best practices.
Key Responsibilities
- Oversee 24×7 monitoring and incident management
- Perform preventive health checks and service analysis
- Facilitate blameless postmortems and maintain reliability dashboards
- Configure observability tools such as CloudWatch and Dynatrace
- Support service requests and AWS transitions
Requirements
- 5+ years of cloud-native SRE or DevOps experience
- Expertise in managing AWS production environments
- Proven track record in incident and change management
- Hands-on experience with observability tools
- Excellent writing and documentation skills
Elevate service reliability through skilled incident management and effective collaboration in AWS environments.
Skills
AWS, CloudWatch, DevOps, Dynatrace, SRE