Director, ML/Dev Ops (Tip.AI)

Marriott International, Inc

Bethesda · flexible · Full-time · Executive · 4d ago

About the role

Position Overview

Lead the transformation of how applications and AI systems are delivered, operated, and evolved at enterprise scale. This role owns the design and execution of AI-powered DevOps in Marriott's AI Platform. The goal is to enable teams to ship production-grade, observable, self-healing services with minimal human toil. You will partner deeply with the Kubernetes platform team, the DevOps platform team, and other organizational leaders to help produce safe, scalable solutions that protect the core AI platform services, so we can provide high-business-value interactions to the organization.

Key Responsibilities

  • Build resilient CI/CD pipelines for platform services that include testing, monitoring and auto-rollback.
  • Deploy models and workloads via Kubernetes + SageMaker, KFServing, Ray Serve, etc.; sustain latency/error budgets.
  • Work with other platform teams to advance their innovation roadmaps as an early adopter.
  • Embed OpenTelemetry traces, vector metrics, and cost monitors into unified dashboards.
  • Implement MCP-compliant gateways for safe human-and-agent invocations.
  • Champion the use of internal autonomous agents to eliminate repetitive DevOps and SRE toil across build, deploy, and runtime operations.
  • Serve as a thought leader for AI-based operations, influencing architecture standards, platform roadmaps, and engineering culture.
  • Coach senior engineers and platform teams on modern DevOps, SRE, and AIOps patterns.
  • Own the delivery and reliability of the platform: lead post-incident learning and drive systemic improvements through blameless retrospectives and automation.

Qualifications

  • Extensive software engineering experience working on highly scalable, highly available systems.
  • Deep knowledge of standard DevOps practices and cloud infrastructure, including identity management and networking.
  • Experience in MLOps working with live models.
  • IaC mastery (CDK/Terraform) and secrets management (Vault, AWS Secrets Manager).
  • Proven record of meeting SLOs for containerized ML services at fleet scale.
  • Deep experience working with cloud platforms.
  • Strong servant leader with a passion for work automation and incident retrospectives.
  • Extreme desire to be part of a committed team that is building for global scale to change the way the world does travel.
  • Excellent verbal communication skills, with the ability to articulate complex architectural decisions clearly.
  • Ability to produce and review extremely clean software documentation.
  • Ability to communicate effectively and asynchronously with remote team members across the globe.

Preferred Skills

  • Experience moving legacy CI to agent-augmented pipelines.
  • Cost-aware autoscaling and GPU quota governance.
  • Experience building with Harness.io.
  • Certifications in AWS or GCP.

Why This Role Matters

This role exists to set the bar for how software systems are delivered and operated at enterprise scale, moving from manual DevOps to AI-driven, self-healing platforms. You will embed intelligence into CI/CD pipelines and Kubernetes runtimes so teams can ship faster, safer, and with far less operational toil. Working closely with platform and Kubernetes teams, you'll introduce AI-based improvements that materially raise reliability, scalability, and efficiency.

If you want to lead the shift from reactive operations to systems that learn, adapt, and run themselves, this role gives you the scope and influence to do it.

At Marriott International, we are dedicated to being an equal opportunity employer, welcoming all and providing access to opportunity. We actively foster an environment where the unique backgrounds of our associates are valued and celebrated. Our greatest strength lies in the rich blend of culture, talent, and experiences of our associates. We are committed to non-discrimination on any protected basis, including disability, veteran status, or other basis protected by applicable law.

About Us

All positions offer a 401(k) plan, stock purchase plan, discounts at Marriott properties, commuter benefits, employee assistance plan, and childcare discounts. Benefits are subject to terms and conditions, which may include rules regarding eligibility, enrollment,…

Skills

AWS Secrets Manager · CDK · CI/CD · Cloud Infrastructure · Docker · Harness.io · IaC · Kubernetes · MLOps · OpenTelemetry · Ray Serve · SageMaker · Terraform · Vault
