Site Reliability Engineer (SRE) / Application Support Engineer

Rosemallow Technologies Pvt Ltd

India · On-site · Full-time · Posted yesterday

About the role

Job Summary

We are looking for a highly motivated SRE / Application Support Engineer to manage and enhance production environments, ensure system reliability, and drive automation across the application lifecycle. The ideal candidate will have strong troubleshooting skills, experience with monitoring tools, and a passion for improving system performance and stability.

Key Responsibilities

Plan, manage, and oversee all aspects of the production environment
Define and implement strategies for Application Performance Monitoring (APM) and optimization
Respond to production incidents and continuously improve platform stability by reducing incident frequency
Support and manage code deployments across multiple environments
Drive automation initiatives to streamline operational processes
Design, develop, and standardize monitoring and alerting mechanisms
Perform root cause analysis and reduce mean time to recovery (MTTR)
Take a holistic approach to troubleshooting across the entire technology stack
Collaborate across teams to improve the service lifecycle, from design through deployment and operations
Analyze ITSM processes and provide feedback to development teams on gaps and resiliency improvements
Support pre-production activities like capacity planning, system design consulting, and launch reviews
Manage and optimize CI/CD pipelines for smooth software promotion across environments
Monitor system performance including availability, latency, and reliability
Continuously improve systems through automation and scalability enhancements
Work with global teams across multiple time zones
Mentor junior team members and share knowledge
Participate in on-call rotations and handle occasional off-hours support

Required Skills (Must Have)

Strong experience in Linux
Proficiency in Shell Scripting
Hands‑on experience with ITIL / ITSM processes
Strong application troubleshooting skills
Knowledge of SQL
Experience with Cloud platforms (AWS / Azure / GCP)
Familiarity with Monitoring tools (e.g., Splunk, Dynatrace)
Experience with CI/CD tools (Jenkins)
Knowledge of Groovy scripting / YAML
Working knowledge of Git / Bitbucket

Preferred Skills (Good to Have)

Experience with Configuration Management tools (Ansible / Chef)
Understanding of Event‑driven or Microservices architecture
Exposure to DevOps best practices and automation frameworks

Soft Skills

Strong problem‑solving and analytical thinking
Ability to work in a fast‑paced production environment
Good communication and collaboration skills
Ability to mentor and guide junior engineers

Skills

AWS, Azure, CI/CD, Cloud, Docker, Dynatrace, GCP, Git, Groovy, ITIL, Jenkins, Linux, Microservices, Monitoring, SQL, Shell Scripting, Splunk, YAML
