
Senior AI Engineer

Jobs via Dice

Sunrise · On-site · Full-time · Senior · 2w ago

About the role

Role Summary

We are looking for a Senior AI Engineer to design, build, and operate production-grade Agentic AI systems that power intelligent assistants and autonomous workflows across regulated enterprise environments.

This role will focus on LLM-based agents, multi-agent orchestration, guardrails, and end-to-end deployment pipelines, with direct exposure to Copilot Studio, LangChain, LangGraph, and enterprise CI/CD practices.

The ideal candidate combines strong Python engineering, agent architecture expertise, and rigorous testing/observability discipline to ensure safe, reliable, and accurate AI responses at scale.

Key Responsibilities

Agentic AI & LLM Engineering

  • Design and implement Agentic AI systems using LangChain and LangGraph, including planner-executor, router, and evaluator patterns.
  • Build multi-agent workflows for intent classification, probing, reasoning, and decision orchestration.
  • Develop tool-using agents (retrieval, rules engines, APIs, enterprise services).
  • Optimize prompt strategies, state management, memory, and reasoning flows to minimize hallucinations and maximize accuracy.

Copilot & Agent Platforms

  • Build and extend agents using Copilot Studio and Power Platform-based agent frameworks.
  • Integrate custom Python-based agents with the Copilot runtime, connectors, and enterprise data sources.
  • Collaborate with product and platform teams to operationalize agents across real business workflows.

Guardrails, Safety & Compliance

  • Implement AI guardrails including:
    • Policy enforcement
    • Output validation
    • Grounding checks (RAG / knowledge-based verification)
    • Human-in-the-loop and escalation patterns
  • Ensure agents comply with enterprise risk, regulatory, and data security standards.
  • Design architectures that are auditable, observable, and deterministic where required.

Testing, Evaluation & Quality Assurance

  • Build automated agent testing frameworks to validate:
    • Correct intent classification
    • Accurate probing behavior
    • Expected response generation
    • Regression prevention across prompt and model changes
  • Implement offline and online evaluation (golden datasets, synthetic tests, confidence scoring).
  • Partner with QA and Ops to monitor accuracy, failure modes, and drift.

CI/CD & Platform Engineering

  • Develop CI/CD pipelines for AI agents (prompt versioning, agent configs, model updates).
  • Support containerized deployments and environment promotion (dev → test → prod).
  • Integrate logging, observability, alerts, and performance metrics for agent behavior.

Required Qualifications

Technical Skills (Must Have)

  • Strong Python engineering experience (async, APIs, services).
  • Hands-on experience with LangChain and/or LangGraph in real-world agent implementations.
  • Experience building Agentic AI systems (not just prompts or chatbots).
  • Understanding of LLM tooling, RAG, function/tool calling, and orchestration patterns.
  • Experience implementing CI/CD pipelines for ML or AI-driven systems.
  • Proven experience in testing LLM outputs and agent behavior.

Platform & Architecture

  • Experience with enterprise AI platforms (Copilot Studio, Power Platform, or equivalent).
  • Familiarity with microservices, APIs, event-driven systems, and cloud-native design.
  • Experience designing governed, production-ready AI architectures.

Preferred / Nice To Have

  • Experience with Copilot Studio custom agents or connectors.
  • Knowledge of LLMOps / AI Ops practices.
  • Experience in regulated domains (financial services, healthcare, compliance-heavy environments).
  • Familiarity with evaluation frameworks, agent observability tools, and policy engines.
  • Exposure to graph-based reasoning or knowledge graphs.

What Success Looks Like

  • Agents consistently generate accurate, policy-compliant, and explainable responses.
  • New agent capabilities move from prototype to production safely and quickly.
  • CI/CD pipelines catch regressions before agents reach users.
  • Guardrails prevent hallucinations and incorrect guidance at scale.
  • AI systems are trusted by both business users and risk/compliance teams.

Skills

API · Async · CI/CD · Copilot Studio · Docker · LangChain · LangGraph · LLM · ML · Microservices · Observability · Power Platform · Python · RAG
