Cloud Infrastructure Manager
Insight Global
Vancouver · Hybrid · Full-time · Lead · $130k – $150k/yr · 1w ago
About the role
Overview
A Pacific Northwest–based organization is seeking an experienced Cloud Infrastructure Manager to lead a high-impact team responsible for Azure platform engineering, SRE/observability, reliability, and cloud cost optimization. This role will oversee engineers supporting production systems at scale, with a heavy focus on incident management, monitoring, MTTR reduction, and Azure expansion. The ideal candidate is a hands-on technical leader with deep Azure expertise, strong people leadership skills, and experience operating mission-critical cloud environments.
Key Responsibilities
- Lead and manage a team of Cloud, Platform, and SRE engineers supporting Azure-based infrastructure
- Own reliability, observability, and production health across applications, servers, and networks
- Drive incident response, outage management, root cause analysis, and MTTR reduction initiatives
- Partner with application, infrastructure, and security teams to scale Azure adoption and best practices
- Oversee cloud cost management and optimization (FinOps), including budgeting, forecasting, and spend governance
- Guide platform strategy across Azure Landing Zones, governance, networking, identity, and monitoring
- Ensure alignment with the Azure Well-Architected Framework and cloud reliability standards
- Provide technical leadership while remaining close to the engineering work when needed
Tech Stack
Cloud Platform: Microsoft Azure
- Azure Landing Zones (Azure LZs), Azure CLI, Azure Virtual Desktop (AVD)
- Azure Container Registry (ACR), Azure DevOps (ADO)
- Azure Advisor, Azure Well-Architected Framework
- Azure governance, networking, identity, and monitoring
Observability & SRE Tooling:
- Datadog, SolarWinds (or similar enterprise monitoring platforms)
- Logging, metrics, alerting, incident management, production system monitoring
Infrastructure & Reliability:
- SRE / Platform Engineering practices
- Incident response, outage management, MTTR reduction
Cloud Cost & Optimization:
- FinOps (cloud spend management, budgeting, optimization)
Infrastructure as Code & Automation (team oversight):
- Terraform (core), Azure-native IaC concepts
Skills
ACR, ADO, Azure Advisor, Azure CLI, Azure DevOps, Azure Landing Zones, Azure Virtual Desktop, Datadog, FinOps, IaC, Microsoft Azure, SolarWinds, Terraform