SRE and Observability Lead
LanceSoft
Markham · On-site · Contract · Lead · CA$40 – CA$45/hr · Today
About the role
The SRE and Observability Lead is responsible for developing and leading the company's enterprise observability and reliability capability. The role collaborates across multiple teams to ensure comprehensive monitoring of every component in the environment. It will establish Dynatrace as the system of record for platform health and apply SRE practices to improve availability, performance, and incident outcomes across applications, infrastructure, and integrations.
A Typical Day
- Own enterprise observability using Dynatrace across cloud, on-prem, ERP, WMS, eCommerce, APIs, and integrations
- Design service topology, dashboards, alerts, and health indicators that reflect business impact
- Apply SRE principles (SLIs, SLOs, error budgets where appropriate) to reduce incidents and improve resilience
- Accelerate incident detection and root-cause analysis, and lead post-incident reviews focused on systemic fixes
- Identify reliability, performance, and capacity risks before they impact the business
- Define observability and SRE standards and enable teams to use them effectively
To Land This Opportunity
- You have 5 years of experience in infrastructure, platform, operations, or reliability engineering
- You demonstrate hands-on experience implementing and operating Dynatrace
- You have a strong understanding of distributed systems, cloud/hybrid environments, and integrations
- You have practical experience with SRE or reliability engineering concepts
- You're comfortable operating in high-impact incident and production environments
Skills
APIs, Dynatrace, ERP, SRE, WMS, cloud, distributed systems, eCommerce, incident management, platform engineering