Site Reliability Engineer (SRE) / Application Support Engineer
Rosemallow Technologies Pvt Ltd
India · On-site · Full-time · Posted yesterday
About the role
Job Summary
We are looking for a highly motivated SRE / Application Support Engineer to manage and enhance production environments, ensure system reliability, and drive automation across the application lifecycle. The ideal candidate will have strong troubleshooting skills, experience with monitoring tools, and a passion for improving system performance and stability.
Key Responsibilities
- Plan, manage, and oversee all aspects of the production environment
- Define and implement strategies for Application Performance Monitoring (APM) and optimization
- Respond to production incidents and continuously improve platform stability by reducing incident frequency
- Support and manage code deployments across multiple environments
- Drive automation initiatives to streamline operational processes
- Design, develop, and standardize monitoring and alerting mechanisms
- Perform root cause analysis and reduce mean time to recovery (MTTR)
- Take a holistic approach to troubleshooting across the entire technology stack
- Collaborate across teams to improve the service lifecycle, from design through deployment and operations
- Analyze ITSM processes and provide feedback to development teams on gaps and resiliency improvements
- Support pre-production activities like capacity planning, system design consulting, and launch reviews
- Manage and optimize CI/CD pipelines for smooth software promotion across environments
- Monitor system performance including availability, latency, and reliability
- Continuously improve systems through automation and scalability enhancements
- Work with global teams across multiple time zones
- Mentor junior team members and share knowledge
- Participate in on-call rotations and handle occasional off-hours support
Required Skills (Must Have)
- Strong experience in Linux
- Proficiency in Shell Scripting
- Hands‑on experience with ITIL / ITSM processes
- Strong application troubleshooting skills
- Knowledge of SQL
- Experience with Cloud platforms (AWS / Azure / GCP)
- Familiarity with Monitoring tools (e.g., Splunk, Dynatrace)
- Experience with CI/CD tools (Jenkins)
- Knowledge of Groovy scripting / YAML
- Working knowledge of Git / Bitbucket
Preferred Skills (Good to Have)
- Experience with Configuration Management tools (Ansible / Chef)
- Understanding of Event‑driven or Microservices architecture
- Exposure to DevOps best practices and automation frameworks
Soft Skills
- Strong problem‑solving and analytical thinking
- Ability to work in a fast‑paced production environment
- Good communication and collaboration skills
- Ability to mentor and guide junior engineers
Skills
AWS, Azure, CI/CD, Cloud, Docker, Dynatrace, GCP, Git, Groovy, ITIL, Jenkins, Linux, Microservices, Monitoring, SQL, Shell Scripting, Splunk, YAML