Enterprise Incident & Reliability Manager

Centraprise

Mt Laurel Township · On-site · Full-time · Lead · 1mo ago

About the role

Incident Manager

Must Have Technical/Functional Skills

  • Incident Management
  • SRE and operations engineering
  • Reliability architecture
  • Automation and observability
  • Executive communication

Roles & Responsibilities

  • Provide technical leadership for enterprise-wide, high-severity incidents, problem investigations, and high-risk changes, while shaping reliability strategy, governance, and operational standards across complex, distributed platforms.
  • Drive incident resolution by directing cross-functional teams through high-impact outages, systemic problem resolution, and large-scale change events.
  • Create scripts in ELK, Grafana, AppDynamics, and COP
  • Auto-execute predefined queries in ELK, Grafana, AppDynamics, and COP to investigate real-time issues
  • Attach live query outputs (metrics, logs, traces) directly to alerts and incidents
  • Eliminate manual tool navigation for the IM and Alert teams
  • Enhance alert systems with contextual intelligence: metric deviations and anomaly trends, relevant log snippets and patterns, and the affected CIs and downstream impacts
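
For candidates who want a concrete picture of the query automation described above, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not the team's actual tooling: the `PREDEFINED_QUERIES` catalogue, the `Incident` type, the `enrich_incident` function, and the query strings are all hypothetical, and the per-tool runners are plain callables standing in for whatever ELK, Grafana, AppDynamics, or COP clients are really in use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical catalogue: alert name -> predefined query per tool.
# The query strings are illustrative placeholders, not real saved searches.
PREDEFINED_QUERIES: Dict[str, Dict[str, str]] = {
    "checkout_latency_high": {
        "ELK": 'service:"checkout" AND level:"ERROR"',
        "Grafana": "p95 latency, checkout service, last 15m",
    },
}

# A runner takes a query string and returns its output as text.
# In practice it would wrap a tool's client; here it is an assumption.
QueryRunner = Callable[[str], str]


@dataclass
class Incident:
    """Hypothetical incident record that collects query outputs."""
    alert_name: str
    attachments: List[dict] = field(default_factory=list)


def enrich_incident(incident: Incident, runners: Dict[str, QueryRunner]) -> Incident:
    """Run every predefined query for the alert and attach the outputs."""
    queries = PREDEFINED_QUERIES.get(incident.alert_name, {})
    for tool, query in queries.items():
        runner = runners.get(tool)
        if runner is None:
            continue  # no runner configured for this tool; skip it
        try:
            output = runner(query)
        except Exception as exc:
            # Record the failure so responders see that the query was attempted.
            output = f"query failed: {exc}"
        incident.attachments.append(
            {"tool": tool, "query": query, "output": output}
        )
    return incident
```

Calling `enrich_incident(Incident("checkout_latency_high"), runners)` with a dictionary of runners returns the incident with one attachment per configured tool, so responders see query results on the alert itself instead of opening each tool by hand.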

Education

  • Minimum Graduation

Skills

AppDynamics, COP, ELK, Grafana
