Director of Platform Engineering

MeeruAI

Remote · US · Full-time · Executive · Posted 1mo ago

About the role

Position Overview

We are seeking an exceptional Director of Platform Engineering to serve as a strategic partner to the Head of Engineering and drive technical excellence, operational efficiency, and business impact across the entire engineering organization.

This leadership role combines four mandates:

Platform & Infrastructure ownership
Engineering Operations leadership
Strategic business partnership
Organizational excellence and execution rigor

This role goes far beyond traditional DevOps or Platform leadership. You will be the right hand to the Head of Engineering, owning day-to-day operational excellence while enabling product velocity, AI scalability, and enterprise-grade reliability.

Role Scope & Accountability

Area (share of role)
Platform & Infrastructure (40%)
Engineering Operations (30%)
Strategic Business Partner (20%)
Organizational Excellence (10%)

Leadership & Collaboration Expectations (Non-Negotiable)

Resolve disagreements privately, present aligned positions publicly
No unaligned executive escalations
Transparent risk communication with aligned mitigation plans
Platform team must be viewed as an enabler, not a blocker
Influence through trust, data, and partnership, not authority

Key Responsibilities

I. Platform & Infrastructure Leadership (40%)

Cloud, Architecture & Scalability

Own AWS infrastructure strategy (EKS, RDS, VPC, IAM, networking)
Define multi-tenant SaaS patterns (shared DB + RLS, silo for enterprise)
Scale platform from 10 → 100 → 500+ customers
Drive vendor evaluation and build vs. buy decisions
Ensure reliability, performance, security, and cost efficiency

AI / ML Infrastructure & MLOps (Critical)

Self-Hosted LLM Infrastructure

Deploy and operate self-hosted small language models (SLMs) for privacy and cost efficiency
GPU infrastructure (AWS P4/G5, 8–16 GPUs)
Model serving: vLLM, TGI, Ray Serve
Fine-tuning pipelines (LoRA, QLoRA)
Quantization (4-bit / 8-bit) and autoscaling

Model Deployment & APIs

Deploy predictive ML models (forecasting, classification, anomaly detection)
Real-time inference (<100ms p95) and batch pipelines
CI/CD for models with canary and blue-green deployments
Drift detection, accuracy tracking, rollback

AI Cost Management & Pricing Enablement

Token and GPU cost tracking per tenant and feature
Unit economics for AI workloads
API vs self-hosted break-even modeling
Prompt caching, response caching, batching strategies (40–60% savings)

AI Observability & SLOs

LLM latency (p50/p95/p99), success rates, token usage
Agent performance (completion rate, tool success, latency)
RAG quality metrics and retrieval accuracy
Cost anomaly detection and alerting

DevOps, Reliability & SRE

Build and scale DevOps/SRE team (3–4 → 8–10)
CI/CD with <10 min deploys, GitOps (ArgoCD / Flux)
Define SLAs/SLOs (99.9% uptime target)
Incident response, blameless postmortems, MTTR/MTTD tracking
Disaster recovery and business continuity planning
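
For context on the 99.9% uptime target above, the downtime it allows can be worked out directly. The following is a minimal sketch of that arithmetic, assuming a 30-day measurement window; the function name and the window length are illustrative assumptions, not part of this posting.

```python
def downtime_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of downtime allowed in the window for a given availability SLO."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo_percent / 100)


if __name__ == "__main__":
    # 99.9% over an assumed 30-day window
    print(f"{downtime_budget_minutes(99.9):.1f} minutes per 30 days")
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of downtime per 30 days, which is the budget that incident response and MTTR tracking would need to fit within.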

Security & Compliance

Own SOC 2 Type I & II, GDPR, HIPAA readiness
Zero Trust security architecture
Vulnerability management and pen testing
Security team hiring and security champions program
Incident response and forensics

Cloud Cost Optimization (FinOps)

Own AWS + AI budget ($50K → $500K+/month)
Reserved instances, spot strategies, right-sizing
Cost allocation by tenant and team
Target: 20% YoY cost reduction

II. Engineering Operations Leadership (30%)

Talent & People Operations

Hiring strategy for 33–44 engineers in Year 1
Build offshore development centers (India / Eastern Europe)
Own performance reviews, promotions, PIPs, exits
Define career ladders, leveling, and compensation bands
Coach managers and directors

Engineering Productivity & Tools

Own dev tooling: GitHub, CI/CD, Jira/Linear, Notion, Datadog
Track DORA metrics, cycle time, developer NPS
Reduce friction via automation and internal platforms
Vendor management and SaaS consolidation

Process & Execution Excellence

Agile ceremonies, RFCs, architecture reviews
Release management and dependency coordination
Executive dashboards and KPI reporting
Conflict resolution via private alignment and consensus

III. Strategic Business Partner (20%)

Platform & AI Pricing Strategy

Define SaaS + AI pricing tiers (Starter / Pro / Enterprise)
Usage-based AI pricing (queries, tokens, agents)
Gross margin modeling (>70% infra, >60% AI)
Cost-to-serve and break-even analysis

Financial Planning & Advisory

Engineering budget ownership ($4.7M–$6M)
Headcount and infrastructure forecasting
ROI analysis for infrastructure investments
Vendor negotiation (AWS, Datadog, Auth0, LLM providers)

Strategic Leadership

Proactively identify blind spots
Quarterly and annual planning partner to Head of Engineering
Support Sales on enterprise deals and security reviews
Board and investor-facing technical leadership

IV. Organizational Excellence (10%)

Define and reinforce engineering culture and values
Knowledge management, documentation, onboarding playbooks
Executive communication and board-level reporting
High-trust, high-performance environment

Required Qualifications

Technical

12+ years engineering experience, 6+ years leadership
AWS at scale (EKS, RDS, VPC, IAM)
Kubernetes, Terraform, CI/CD, DevSecOps
Required: Self-hosted LLMs, GPU infra, MLOps in production
AI cost optimization and observability experience
Security and compliance leadership (SOC 2, GDPR, HIPAA)

Operational & Business

Led 30–50+ person engineering orgs
Hiring, performance management, and org design
$5M+ engineering budgets
SaaS unit economics and pricing strategy
Executive-level communication and diplomacy

Preferred Qualifications

VP Engineering experience at Series A/B/C startup
Large-scale AI/GPU deployments (100+ GPUs)
Fintech or regulated domain experience
Offshore center build-out experience
MBA or executive leadership training

Skills

AWS, AWS EKS, AWS G5, AWS P4, Auth0, ArgoCD, CI/CD, Datadog, DevSecOps, Docker, GitOps, GitHub, HIPAA, Jira, Kubernetes, Linear, LoRA, LLM, MLOps, Notion, QLoRA, Ray Serve, SaaS, SOC 2, TGI, Terraform, VPC, vLLM
