Principal, AI platform and MLOps architect

SRM Digital LLC

Philadelphia · On-site · Contract · Lead · 4d ago

About the role

Minimum Experience

  • 20+ years

Job Description

  • Seeking a senior AI/ML platform leader to design and operationalize a scalable, production-ready AI/ML architecture across multiple business units.
  • This role is responsible for moving the organization from proof-of-concept AI efforts to repeatable, governed delivery of models in production.
  • The ideal candidate has built and deployed machine learning systems at scale, understands both ML development and platform engineering, and can define the environments, pipelines, and architectural standards that enable teams to ship AI safely and efficiently.
  • This is a hands‑on architectural leadership role with significant influence across product, engineering, data and security teams.

Key Responsibilities

  • AI/ML architecture and platform design: Define the end-to-end AI/ML reference architecture, from data ingestion through model serving and monitoring. Establish standards for data storage, access patterns, and lineage, including separation of raw, curated, and feature-ready data. Design for scale across multiple business units with differing data sensitivity, regulatory, and operational needs. Assess and define the need for shared platform capabilities such as:
    • feature stores
    • model registries
    • AI/ML catalogues
    • experiment tracking
  • Environment and delivery pipeline: Define standard development, validation, and production environments for AI/ML workloads. Establish CI/CD (and continuous training where appropriate) best practices for ML systems. Design a repeatable ML delivery pipeline covering:
    • model development and training
    • validation, approval, and promotion
    • deployment (batch and/or real time)
    • monitoring, drift detection, and retraining
  • MLOps governance and production readiness: Partner with security, legal, and compliance teams to integrate governance without slowing delivery. Ensure models are discoverable, auditable, and traceable across environments and business units. Define what production-ready means for AI/ML models, including:
    • testing and validation requirements
    • monitoring and alerting
    • rollback and incident response patterns
  • Enablement and operating model: Create a paved road for AI development:
    • shared standards, templates, and tooling that business units can self-serve against
    • advise and enable product teams and engineering teams moving models from POC to production
    • help define the long‑term operating model for AI/ML ownership across central platform teams and federated BU teams

Required Qualifications

  • Experience in software engineering, data platforms, or ML engineering
  • Years of hands-on experience deploying machine learning systems into production
  • Proven experience in designing AI/ML platforms or MLOps architectures (not just individual models)
  • Strong understanding of:
    • ML lifecycle management
    • data pipelines and feature engineering
    • model serving patterns (batch, real time, APIs)
  • Experience working across organizational boundaries (multiple business teams and units)
  • Ability to communicate architectural designs and decisions clearly to both technical and non‑technical stakeholders

Preferred Qualifications

  • Experience designing or operating
    • feature stores
    • model registries and experiment tracking platforms
    • AI governance and risk frameworks
  • Familiarity with cloud-native ML platforms and infrastructure (AWS, GCP, Azure, or similar)
  • Experience mentoring teams and establishing standards at scale
  • Background in regulated or data-sensitive environments

Success Metrics (12 to 18 months)

  • A documented, adopted AI/ML reference architecture used across business units
  • Clear, standardized environments and pipelines enabling faster promotion from POC to production
  • Reduced duplication of feature engineering and model deployment efforts
  • Consistent visibility into which models are running in production and how they perform

Skills

AI · CI/CD · GCP · MLOps · AWS · Azure
