Senior Software Engineer - Machine Learning
Caffeine
About Caffeine.ai
Caffeine.ai is building the platform for self-writing apps — where natural language produces full-stack, production-ready applications deployed to the Internet Computer, an open sovereign cloud. Our mission is to make building software as simple as a conversation: ideas become live systems in minutes, with no code required.
What sets Caffeine apart is the infrastructure beneath it. While other self-writing platforms build on traditional stacks, Caffeine runs on a different foundation — one where apps are tamperproof by design, data is guaranteed safe on every update, and backend code is written in Motoko, a language built specifically for AI code generation. This is a platform built for real production software, not just prototypes.
We are a cross-functional team of engineers and researchers building the AI that powers this new paradigm.
About the Role
As a Senior Software Engineer — Machine Learning, you will own the layer between our agentic core and everything the user sees and touches. That means multi-agent orchestration, real-time streaming pipelines, and the persistence layer that holds the state of applications that were never manually written. This is not prompt engineering — it's the industrial‑grade plumbing underneath it.
What You'll Do
- Own real-time streaming infrastructure: Build and operate the SSE pipeline that delivers agentic job state from backend to client — designing for latency, reliability, and graceful failure at every step.
- Build the job orchestration layer: Coordinate multi-agent workflows end‑to‑end — dispatch, retries, state recovery, and context continuity across long‑running, non‑deterministic workloads.
- Design schemas and persistence strategies: Own the database layer for agentic work — jobs, artifacts, agent memory, and the user's evolving application state.
- Bridge agent output to product: Transform raw agent output into the structured data models the frontend and other services depend on.
- Instrument the full pipeline: Measure latency, throughput, and failure surfaces — and stay close to production behaviour across every release.
Who You Are
- Streaming systems experience: You've designed real-time streaming systems in production — SSE, event‑driven architectures, or similar — and you know where they fail under load.
- Database‑as‑design‑surface thinker: You think about databases not just as storage but as a design surface — schema decisions, consistency guarantees, and state lifecycle are things you get opinionated about.
- Agentic/LLM pipeline experience: You've worked with agentic or LLM pipelines in a backend context and understand the operational challenges of long‑running, non‑deterministic workloads.
- Product‑aware infrastructure mindset: You care about the user‑facing effect of your infrastructure choices — latency, dropped events, and stale state are product problems as much as engineering ones.
- High autonomy: Ambiguity doesn't stall you — you scope the surface, make a call, and ship something you can measure.
- Small‑team energy: You're energised by small teams where your work reaches real users within days, not quarters.
Bonus Points
- Experience with TypeScript backend frameworks (Node.js, NestJS, Fastify).
- Familiarity with multi‑agent architectures or AI orchestration systems.
- Experience with event‑driven architectures and message queues.
- Knowledge of DevOps (Docker, Kubernetes, Observability).
- Interest in Web3 or sovereign cloud infrastructure.
Location
This is an on‑site role. We work together in person, every day — it's core to how we build. We don't offer remote or hybrid arrangements.