Senior AI Engineer

Jobs via Dice

Sunrise · On-site · Full-time · Senior · 2mo ago

About the role

Role Summary

We are looking for a Senior AI Engineer to design, build, and operate production-grade Agentic AI systems that power intelligent assistants and autonomous workflows across regulated enterprise environments.

This role will focus on LLM-based agents, multi-agent orchestration, guardrails, and end-to-end deployment pipelines, with direct exposure to Copilot Studio, LangChain, LangGraph, and enterprise CI/CD practices.

The ideal candidate combines strong Python engineering, agent architecture expertise, and rigorous testing/observability discipline to ensure safe, reliable, and accurate AI responses at scale.

Key Responsibilities

Agentic AI & LLM Engineering

Design and implement Agentic AI systems using LangChain and LangGraph, including planner-executor, router, and evaluator patterns.
Build multi-agent workflows for intent classification, probing, reasoning, and decision orchestration.
Develop tool-using agents (retrieval, rules engines, APIs, enterprise services).
Optimize prompt strategies, state management, memory, and reasoning flows to minimize hallucinations and maximize accuracy.

Copilot & Agent Platforms

Build and extend agents using Copilot Studio and Power Platform-based agent frameworks.
Integrate custom Python-based agents with Copilot runtime, connectors, and enterprise data sources.
Collaborate with product and platform teams to operationalize agents across real business workflows.

Guardrails, Safety & Compliance

Implement AI guardrails including:
- Policy enforcement
- Output validation
- Grounding checks (RAG / knowledge-based verification)
- Human-in-the-loop and escalation patterns
Ensure agents comply with enterprise risk, regulatory, and data security standards.
Design architectures that are auditable, observable, and deterministic where required.

Testing, Evaluation & Quality Assurance

Build automated agent testing frameworks to validate:
- Correct intent classification
- Accurate probing behavior
- Expected response generation
- Regression prevention across prompt and model changes
Implement offline and online evaluation (golden datasets, synthetic tests, confidence scoring).
Partner with QA and Ops to monitor accuracy, failure modes, and drift.

CI/CD & Platform Engineering

Develop CI/CD pipelines for AI agents (prompt versioning, agent configs, model updates).
Support containerized deployments and environment promotion (dev → test → prod).
Integrate logging, observability, alerts, and performance metrics for agent behavior.

Required Qualifications

Technical Skills (Must Have)

Strong Python engineering experience (async, APIs, services).
Hands-on experience with LangChain and/or LangGraph in real-world agent implementations.
Experience building Agentic AI systems (not just prompts or chatbots).
Understanding of LLM tooling, RAG, function/tool calling, and orchestration patterns.
Experience implementing CI/CD pipelines for ML or AI-driven systems.
Proven experience in testing LLM outputs and agent behavior.

Platform & Architecture

Experience with enterprise AI platforms (Copilot Studio, Power Platform, or equivalent).
Familiarity with microservices, APIs, event-driven systems, and cloud-native design.
Experience designing governed, production-ready AI architectures.

Preferred / Nice To Have

Experience with Copilot Studio custom agents or connectors.
Knowledge of LLMOps / AI Ops practices.
Experience in regulated domains (financial services, healthcare, compliance-heavy environments).
Familiarity with evaluation frameworks, agent observability tools, and policy engines.
Exposure to graph-based reasoning or knowledge graphs.

What Success Looks Like

Agents consistently generate accurate, policy-compliant, and explainable responses.
New agent capabilities move from prototype to production safely and quickly.
CI/CD pipelines catch regressions before agents reach users.
Guardrails prevent hallucinations and incorrect guidance at scale.
AI systems are trusted by both business users and risk/compliance teams.

Skills

API, Async, CI/CD, Copilot Studio, Docker, LangChain, LangGraph, LLM, ML, Microservices, Observability, Power Platform, Python, RAG
