SRE & DevOps Engineer
algoleap
About the role
Job Title: SRE and DevOps Engineer
Job Location: Hyderabad / Gurugram / Noida
Start Date: As soon as possible
Key Responsibilities
• Work closely with development, operations, and product teams to ensure monitoring solutions align with business goals.
• Create and maintain scripts and automation tools to streamline monitoring and alerting processes.
• Produce and maintain clear documentation on monitoring setups, best practices, and troubleshooting procedures.
• Train team members and stakeholders on effective use and management of Datadog tools and features.
• Monitor the performance and availability of software systems, identify and resolve issues, and implement proactive measures to prevent future incidents.
• Design and maintain fault-tolerant architectures using redundancy, load balancing, and automated failover mechanisms to minimize downtime and ensure seamless service availability.
• Develop and implement automation strategies to reduce manual intervention and improve system reliability.
• Optimize system performance through proactive monitoring and tuning.
• Prepare and execute disaster recovery plans to ensure business continuity.
• Work closely with development and operations teams to bridge the gap between them, ensuring smooth deployment and operation of applications.
Incident Management
• Follow the incident management process, ensuring timely resolution and minimizing service disruptions.
• Conduct root cause analysis and implement preventive measures to reduce recurring incidents.
• Develop and maintain incident response procedures and communication protocols.
Change Management
• Manage the change management process, ensuring controlled and efficient implementation of changes.
• Assess the impact of proposed changes and mitigate potential risks.
• Ensure compliance with change management policies and procedures.
Metrics And Reporting
• Generate regular reports and dashboards to provide insights into service performance.
• Use data-driven insights to identify trends and drive continuous improvement.
Transformation And Automation
• Identify opportunities for process automation and implement solutions to improve efficiency.
• Evaluate and implement new monitoring tools.
Key Requirements
• Proven expertise in multiple monitoring tools.
• Minimum of 8 years of experience in monitoring and DevOps.
• Proficiency in scripting, coding, and software development principles.
• Strong understanding of IT operations and system management.
• Strong experience with automation tools and frameworks.
• Excellent troubleshooting and problem-solving skills.