Site Reliability Engineer (SRE) / Application Support Engineer

Rosemallow Technologies Pvt Ltd

India · On-site · Full-time · Posted yesterday

About the role

Job Summary

We are looking for a highly motivated SRE / Application Support Engineer to manage and enhance production environments, ensure system reliability, and drive automation across the application lifecycle. The ideal candidate will have strong troubleshooting skills, experience with monitoring tools, and a passion for improving system performance and stability.

Key Responsibilities

  • Plan, manage, and oversee all aspects of the production environment
  • Define and implement strategies for Application Performance Monitoring (APM) and optimization
  • Respond to production incidents and continuously improve platform stability by reducing incident frequency
  • Support and manage code deployments across multiple environments
  • Drive automation initiatives to streamline operational processes
  • Design, develop, and standardize monitoring and alerting mechanisms
  • Perform root cause analysis and reduce mean time to recovery (MTTR)
  • Take a holistic approach to troubleshooting across the entire technology stack
  • Collaborate across teams to improve the service lifecycle, from design through deployment and operations
  • Analyze ITSM processes and provide feedback to development teams on gaps and resiliency improvements
  • Support pre-production activities such as capacity planning, system design consulting, and launch reviews
  • Manage and optimize CI/CD pipelines for smooth software promotion across environments
  • Monitor system performance including availability, latency, and reliability
  • Continuously improve systems through automation and scalability enhancements
  • Work with global teams across multiple time zones
  • Mentor junior team members and share knowledge
  • Participate in on-call rotations and handle occasional off-hours support

Required Skills (Must Have)

  • Strong experience in Linux
  • Proficiency in Shell Scripting
  • Hands‑on experience with ITIL / ITSM processes
  • Strong application troubleshooting skills
  • Knowledge of SQL
  • Experience with Cloud platforms (AWS / Azure / GCP)
  • Familiarity with Monitoring tools (e.g., Splunk, Dynatrace)
  • Experience with CI/CD tools (Jenkins)
  • Knowledge of Groovy scripting / YAML
  • Working knowledge of Git / Bitbucket

Preferred Skills (Good to Have)

  • Experience with Configuration Management tools (Ansible / Chef)
  • Understanding of Event‑driven or Microservices architecture
  • Exposure to DevOps best practices and automation frameworks

Soft Skills

  • Strong problem‑solving and analytical thinking
  • Ability to work in a fast‑paced production environment
  • Good communication and collaboration skills
  • Ability to mentor and guide junior engineers

Skills

AWS, Azure, CI/CD, Cloud, Docker, Dynatrace, GCP, Git, Groovy, ITIL, Jenkins, Linux, Microservices, Monitoring, SQL, Shell Scripting, Splunk, YAML
