Senior Voice AI Engineer - Multi-Agent & Role-Based Dialogue

Upwork

Remote (Global) · Contract · Senior · Today

About the role

We are looking for developers who have worked across the modern LLM and voice AI agent ecosystems, including OpenAI, Google (Gemini), and related tooling.

Responsibilities

  • Design and build multi-agent voice conversational systems
  • Implement role-based dialogue logic (distinct behaviors, goals, and constraints per agent)
  • Integrate LLMs (OpenAI, Gemini, or similar) into real-time systems
  • Develop low-latency voice interaction pipelines (STT → LLM → TTS)
  • Handle:
    • Interruptions
    • Turn-taking
    • Context management across multiple agents
  • Optimize streaming and response times
  • Contribute to prompt/system design for structured outputs

Requirements

  • Proven experience building AI voice agents or real-time conversational voice systems
  • Hands-on experience with the real-time APIs of at least one major LLM platform:
    • OpenAI
    • Google Gemini / GCP
  • Strong understanding of:
    • Multi-agent systems
    • Conversation flow design
    • Prompt engineering
    • Real-time architectures (WebSockets, streaming)

Skills

Gemini, Google Cloud Platform, LLM, OpenAI, STT, Streaming, TTS, WebSockets
