Cloud Infrastructure Manager

Insight Global

Vancouver · Hybrid · Full-time · Lead · $130k – $150k/yr · Posted 1w ago

About the role

Overview

A Pacific Northwest–based organization is seeking an experienced Cloud Infrastructure Manager to lead a high-impact team responsible for Azure platform engineering, SRE/observability, reliability, and cloud cost optimization. This role will oversee engineers supporting production systems at scale, with a heavy focus on incident management, monitoring, MTTR reduction, and Azure expansion. The ideal candidate is a hands-on technical leader with deep Azure expertise, strong people leadership skills, and experience operating mission-critical cloud environments.

Key Responsibilities

  • Lead and manage a team of Cloud, Platform, and SRE engineers supporting Azure-based infrastructure
  • Own reliability, observability, and production health across applications, servers, and networks
  • Drive incident response, outage management, root cause analysis, and MTTR reduction initiatives
  • Partner with application, infrastructure, and security teams to scale Azure adoption and best practices
  • Oversee cloud cost management and optimization (FinOps), including budgeting, forecasting, and spend governance
  • Guide platform strategy across Azure Landing Zones, governance, networking, identity, and monitoring
  • Ensure alignment with the Azure Well-Architected Framework and cloud reliability standards
  • Provide technical leadership while remaining close to the engineering work when needed

Tech Stack

Cloud Platform: Microsoft Azure

  • Azure Landing Zones (Azure LZs), Azure CLI, Azure Virtual Desktop (AVD)
  • Azure Container Registry (ACR), Azure DevOps (ADO)
  • Azure Advisor, Azure Well-Architected Framework
  • Azure governance, networking, identity, and monitoring

Observability & SRE Tooling:

  • Datadog, SolarWinds (or similar enterprise monitoring platforms)
  • Logging, metrics, alerting, incident management, production system monitoring

Infrastructure & Reliability:

  • SRE / Platform Engineering practices
  • Incident response, outage management, MTTR reduction

Cloud Cost & Optimization:

  • FinOps (cloud spend management, budgeting, optimization)

Infrastructure as Code & Automation (team oversight):

  • Terraform (core), Azure-native IaC concepts

Skills

ACR · ADO · Azure Advisor · Azure CLI · Azure DevOps · Azure Landing Zones · Azure Virtual Desktop · Datadog · FinOps · IaC · Microsoft Azure · SolarWinds · Terraform
