Director of Platform Engineering

MeeruAI

Pleasanton · On-site Full-time Executive 1w ago

About the role

Title: Director of Platform Engineering

Location: Remote (US Preferred)

Reports to: Head of Engineering

Team Size: 5–10 initially, scaling to 15+

Critical Hire Timeline: Week 2–3 (platform foundation required for Maverick launch)

Position Overview

We are seeking an exceptional Director of Platform Engineering to serve as a strategic partner to the Head of Engineering and drive technical excellence, operational efficiency, and business impact across the entire engineering organization.

This is a dual-mandate leadership role that combines:
• Platform & Infrastructure ownership
• Engineering Operations leadership
• Strategic business partnership
• Organizational excellence and execution rigor

This role goes far beyond traditional DevOps or Platform leadership. You will be the right hand to the Head of Engineering, owning day-to-day operational excellence while enabling product velocity, AI scalability, and enterprise-grade reliability.

Role Scope & Accountability

Area: Ownership
Platform & Infrastructure: 40%
Engineering Operations: 30%
Strategic Business Partnership: 20%
Organizational Excellence: 10%

Leadership & Collaboration Expectations (Non-Negotiable)
• Resolve disagreements privately, present aligned positions publicly
• No unaligned executive escalations
• Transparent risk communication with aligned mitigation plans
• Platform team must be viewed as an enabler, not a blocker
• Influence through trust, data, and partnership, not authority

Key Responsibilities

I. Platform & Infrastructure Leadership (40%)

Cloud, Architecture & Scalability
• Own AWS infrastructure strategy (EKS, RDS, VPC, IAM, networking)
• Define multi-tenant SaaS patterns (shared DB + RLS, silo for enterprise)
• Scale platform from 10 → 100 → 500+ customers
• Drive vendor evaluation and build vs. buy decisions
• Ensure reliability, performance, security, and cost efficiency

AI / ML Infrastructure & MLOps (Critical)

Self-Hosted LLM Infrastructure
• Deploy and operate self-hosted SLMs for privacy and cost efficiency
• GPU infrastructure (AWS P4/G5, 8–16 GPUs)
• Model serving: vLLM, TGI, Ray Serve
• Fine-tuning pipelines (LoRA, QLoRA)
• Quantization (4-bit / 8-bit) and autoscaling

Model Deployment & APIs
• Deploy predictive ML models (forecasting, classification, anomaly detection)
• Real-time inference (<100ms p95) and batch pipelines
• CI/CD for models with canary and blue-green deployments
• Drift detection, accuracy tracking, rollback

AI Cost Management & Pricing Enablement
• Token and GPU cost tracking per tenant and feature
• Unit economics for AI workloads
• API vs self-hosted break-even modeling
• Prompt caching, response caching, batching strategies (40–60% savings)
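For candidates who want a concrete picture of the "API vs self-hosted break-even modeling" item above, here is a minimal sketch. It is illustrative only: the function name, the cost model (one fixed monthly GPU cost plus an optional marginal cost per 1K tokens), and every dollar figure are assumptions, not MeeruAI numbers.

```python
def break_even_tokens_per_month(
    gpu_fixed_cost_usd: float,
    api_price_per_1k_usd: float,
    self_hosted_marginal_per_1k_usd: float = 0.0,
) -> float:
    """Monthly token volume at which self-hosting costs the same as a hosted API.

    Assumed (hypothetical) cost model:
      API cost         = tokens / 1000 * api_price_per_1k_usd
      Self-hosted cost = gpu_fixed_cost_usd
                         + tokens / 1000 * self_hosted_marginal_per_1k_usd
    """
    saving_per_1k = api_price_per_1k_usd - self_hosted_marginal_per_1k_usd
    if saving_per_1k <= 0:
        # Self-hosting never pays off if each token costs as much as the API.
        raise ValueError("self-hosted marginal cost must be below the API price")
    return gpu_fixed_cost_usd / saving_per_1k * 1000


# Hypothetical inputs: $20,000/month of GPU capacity vs. $0.002 per 1K API tokens.
tokens = break_even_tokens_per_month(20_000.0, 0.002)
print(f"Break-even at roughly {tokens:,.0f} tokens per month")
```

Below the break-even volume the hosted API is cheaper under these assumptions; above it, self-hosting is. A real model would also account for utilization, engineering time, and caching savings.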

AI Observability & SLOs
• LLM latency (p50/p95/p99), success rates, token usage
• Agent performance (completion rate, tool success, latency)
• RAG quality metrics and retrieval accuracy
• Cost anomaly detection and alerting

DevOps, Reliability & SRE
• Build and scale DevOps/SRE team (3–4 → 8–10)
• CI/CD with <10 min deploys, GitOps (ArgoCD / Flux)
• Define SLAs/SLOs (99.9% uptime target)
• Incident response, blameless postmortems, MTTR/MTTD tracking
• Disaster recovery and business continuity planning
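To put the 99.9% uptime target above in concrete terms, the short sketch below converts an availability SLO into a downtime budget. The function name and the 30-day window are assumptions chosen for illustration; the posting does not specify a measurement period.

```python
def downtime_budget_minutes(slo: float, period_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a period."""
    if not 0.0 < slo <= 1.0:
        raise ValueError("slo must be a fraction in (0, 1]")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - slo)


# 99.9% availability over an assumed 30-day window.
budget = downtime_budget_minutes(0.999)
print(f"{budget:.1f} minutes of downtime allowed per 30 days")
```

Under these assumptions the budget comes to about 43 minutes per 30 days, which is the figure incident response and MTTR tracking would be measured against.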

Security & Compliance
• Own SOC 2 Type I & II, GDPR, HIPAA readiness
• Zero Trust security architecture
• Vulnerability management and pen testing
• Security team hiring and security champions program
• Incident response and forensics

Cloud Cost Optimization (FinOps)
• Own AWS + AI budget ($50K → $500K+/month)
• Reserved instances, spot strategies, right-sizing
• Cost allocation by tenant and team
• Target: 20% YoY cost reduction

II. Engineering Operations Leadership (30%)

Talent & People Operations
• Hiring strategy for 33–44 engineers in Year 1
• Build offshore development centers (India / Eastern Europe)
• Own performance reviews, promotions, PIPs, exits
• Define career ladders, leveling, and compensation bands
• Coach managers and directors

Engineering Productivity & Tools
• Own dev tooling: GitHub, CI/CD, Jira/Linear, Notion, Datadog
• Track DORA metrics, cycle time, developer NPS
• Reduce friction via automation and internal platforms
• Vendor management and SaaS consolidation

Process & Execution Excellence
• Agile ceremonies, RFCs, architecture reviews
• Release management and dependency coordination
• Executive dashboards and KPI reporting
• Conflict resolution via private alignment and consensus

III. Strategic Business Partner (20%)

Platform & AI Pricing Strategy
• Define SaaS + AI pricing tiers (Starter / Pro / Enterprise)
• Usage-based AI pricing (queries, tokens, agents)
• Gross margin modeling (>70% infra, >60% AI)
• Cost-to-serve and break-even analysis

Financial Planning & Advisory
• Engineering budget ownership ($4.7M–$6M)
• Headcount and infrastructure forecasting
• ROI analysis for infrastructure investments
• Vendor negotiation (AWS, Datadog, Auth0, LLM providers)

Strategic Leadership
• Identify blind spots proactively
• Quarterly and annual planning partner to Head of Engineering
• Support Sales on enterprise deals and security reviews
• Board and investor-facing technical leadership

IV. Organizational Excellence (10%)
• Define and reinforce engineering culture and values
• Knowledge management, documentation, onboarding playbooks
• Executive communication and board-level reporting
• High-trust, high-performance environment

Required Qualifications

Technical
• 12+ years engineering experience, 6+ years leadership
• AWS at scale (EKS, RDS, VPC, IAM)
• Kubernetes, Terraform, CI/CD, DevSecOps
• Required: Self-hosted LLMs, GPU infra, MLOps in production
• AI cost optimization and observability experience
• Security and compliance leadership (SOC 2, GDPR, HIPAA)

Operational & Business
• Led 30–50+ person engineering orgs
• Hiring, performance management, and org design
• $5M+ engineering budgets
• SaaS unit economics and pricing strategy
• Executive-level communication and diplomacy

Preferred Qualifications
• VP Engineering experience at Series A/B/C startup
• Large-scale AI/GPU deployments (100+ GPUs)
• Fintech or regulated domain experience
• Offshore center build-out experience
• MBA or executive leadership training
