Resiliency and Recovery Engineer - Tech Lead

Jobs via Dice

Charlotte · On-site · Full-time · Lead · 1w ago

About the role

PROLIM Global Corporation is seeking a Resiliency and Recovery Engineer - Tech Lead.

Responsibilities

  • Own end‑to‑end application reliability, availability, and performance for client‑critical systems.
  • Define and govern SLIs, SLOs, and error budgets aligned with business and regulatory expectations.
  • Lead production support and incident management, acting as Incident Commander for P1/P2 issues.
  • Ensure robust monitoring, alerting, logging, and observability across application landscapes.
  • Drive automation and self‑healing to reduce manual toil and improve operational efficiency.
  • Partner with development and DevOps teams to embed SRE practices into CI/CD and release pipelines.
  • Oversee change and release readiness, ensuring risk‑based production deployments.
  • Provide on‑site client leadership, serving as the primary SRE point of contact and trusted advisor.
  • Conduct and govern post‑incident reviews (RCA/PIR) and ensure preventive actions are implemented.
  • Ensure compliance with security, audit, and regulatory controls relevant to the client environment.
  • Lead and mentor onshore and offshore SRE/support teams, ensuring SLA adherence and skill uplift.
  • Report operational KPIs, reliability trends, and improvement roadmaps to client and internal leadership.
