Enterprise Incident & Reliability Manager
Centraprise
Mt Laurel Township · On-site · Full-time · Lead · 1mo ago
About the role
Incident Manager
Must Have Technical/Functional Skills
- Incident Management
- SRE and operations engineering
- Reliability architecture
- Automation and observability
- Executive communication
Roles & Responsibilities
- Provide technical leadership for enterprise-wide, high-severity incidents, problem investigations, and high-risk changes, while shaping reliability strategy, governance, and operational standards across complex, distributed platforms.
- Drive incident resolution by directing cross-functional teams through high-impact outages, systemic problem resolution, and large-scale change events.
- Create scripts in ELK, Grafana, AppDynamics, and COP.
- Auto-execute predefined queries in ELK, Grafana, AppDynamics, and COP for real-time issues.
- Attach live query outputs (metrics, logs, traces) directly to alerts and incidents.
- Eliminate manual tool navigation for IM and Alert teams.
- Enhance alert systems with contextual intelligence: metric deviations and anomaly trends, relevant log snippets and patterns, and affected CIs with their downstream impacts.
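For candidates gauging the automation work described above, the following is a minimal Python sketch of the idea, not the team's actual tooling. It runs a set of predefined queries for an alert, attaches the outputs to the alert, and adds a simple metric deviation score. The `Alert` class, the query templates, and `fake_run_query` are all hypothetical; a real setup would replace `fake_run_query` with calls to the respective tool clients, and the query syntax shown is illustrative only.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Callable, Dict, List


@dataclass
class Alert:
    alert_id: str
    service: str
    metric_values: List[float]  # recent samples, newest last
    context: Dict[str, object] = field(default_factory=dict)


# Hypothetical query templates keyed by tool name (assumed syntax, for illustration).
PREDEFINED_QUERIES: Dict[str, str] = {
    "elk": 'service:"{service}" AND level:ERROR',
    "grafana": "error_rate{{service='{service}'}}",
}


def deviation_score(samples: List[float]) -> float:
    """Z-score of the newest sample against the earlier samples."""
    if len(samples) < 2:
        return 0.0
    baseline, latest = samples[:-1], samples[-1]
    spread = pstdev(baseline)
    if spread == 0:
        return 0.0
    return (latest - mean(baseline)) / spread


def enrich(alert: Alert, run_query: Callable[[str, str], List[str]]) -> Alert:
    """Run every predefined query and attach the results to the alert."""
    for tool, template in PREDEFINED_QUERIES.items():
        query = template.format(service=alert.service)
        alert.context[tool] = {"query": query, "output": run_query(tool, query)}
    alert.context["deviation"] = round(deviation_score(alert.metric_values), 2)
    return alert


def fake_run_query(tool: str, query: str) -> List[str]:
    # Stand-in for a real client call; returns a canned row.
    return [f"{tool} result for: {query}"]


if __name__ == "__main__":
    alert = Alert("A1", "checkout", [1.0, 1.0, 3.0, 3.0, 10.0])
    enriched = enrich(alert, fake_run_query)
    for key, value in enriched.context.items():
        print(key, value)
```

Passing the query runner in as a function keeps the enrichment logic independent of any one tool, so the same flow could cover additional sources without changing `enrich`.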
Education
- Minimum qualification: graduation (bachelor's degree)
Skills
AppDynamics, COP, ELK, Grafana