Senior Manager, Platform Engineering

Marqeta, Inc.

Remote · Canada Full-time Senior 1mo ago


About the role

Location

Capreol

Overview

We're seeking an experienced Senior Manager, Platform Engineering to lead our infrastructure software engineering team responsible for Marqeta's Kubernetes-based compute platform. This is a critical technical leadership role at the heart of our infrastructure, supporting the applications that process millions of payment transactions with ultra-high availability requirements.

You'll drive platform modernization, cost optimization, and operational excellence while leading a team of infrastructure software engineers who build and maintain the foundational systems that enable Marqeta's engineering organization to ship quickly and reliably.

We work Flexible First. This role can be performed remotely in the United States, but only from one of our National or Premium locations.

The Impact You’ll Have

Leadership & Team Development

Lead, mentor, and grow a team of infrastructure software engineers focused on Kubernetes platform engineering
Build a culture of innovation, operational excellence, and customer-focused platform development
Recruit top talent and develop career growth paths for team members
Foster collaboration with application development teams, SRE, security, and other infrastructure teams
Drive technical decision-making while empowering engineers to own their solutions
Define and execute the technical roadmap for Marqeta's Kubernetes compute platform
Drive continuous platform modernization to support evolving business needs and scale requirements
Champion platform-as-a-product mindset, treating internal engineering teams as customers
Evaluate and integrate emerging technologies and AWS services to improve platform capabilities
Lead architectural decisions for container orchestration, service mesh, observability, and developer tooling
Develop and implement strategies to optimize Kubernetes infrastructure costs without compromising performance or reliability
Monitor and analyze compute resource utilization, identifying opportunities for right-sizing and efficiency gains
Implement FinOps practices including chargeback/showback models, budget alerting, and cost allocation
Drive adoption of cost-effective AWS services and spot instances where appropriate
Partner with engineering teams to optimize application resource requests and limits

Operational Excellence & Availability

Ensure ultra-high availability of the production Kubernetes platform supporting payment processing workloads
Establish SLOs/SLIs for platform reliability and performance
Lead incident response for platform-level issues and drive continuous improvement through blameless postmortems
Implement comprehensive monitoring, alerting, and observability solutions
Balance innovation with stability through disciplined change management and deployment practices
Design and implement CI/CD pipelines and deployment automation for platform infrastructure
Apply software development best practices to infrastructure code (testing, code review, version control)
Drive infrastructure-as-code initiatives using Terraform and other automation tools
Collaborate with security teams to embed security into the platform and SDLC
Enable developer productivity through self‑service capabilities and golden paths

Technical Environment

You’ll be working with technologies including (but not limited to):

CI/CD: Argo
Observability: Datadog

Who You Are

8+ years of experience in infrastructure engineering, platform engineering, or DevOps roles
5+ years of people management experience, leading technical teams through complex initiatives
Deep expertise with Kubernetes in production environments at scale (architecture, operations, troubleshooting)
Extensive knowledge of cloud fundamentals (AWS preferred, including EKS, EC2, VPC, IAM, and other core services)
Proven track record of Kubernetes cost optimization and resource efficiency improvements
Strong understanding of SDLC methodologies, CI/CD practices, and infrastructure-as-code
Experience managing ultra-high availability systems (99.99%+ uptime) in production
Proficiency with infrastructure-as-code tools (Terraform, CloudFormation, etc.)
Hands‑on experience with container technologies, service mesh, and cloud‑native architectures
S…

Skills

AWS, AWS CloudFormation, AWS EC2, AWS EKS, AWS IAM, AWS VPC, Argo, Datadog, Docker, Kubernetes, Terraform
