Founding AI Research Engineer / Scientist — Institutional World Models, LLMs, Graph ML, MLOps & AI Safety

HYLMAN

Remote · France · Contract · Lead · 1mo ago

About the role

HYLMAN is forming the founding technical team for a new frontier-AI lab initiative focused on verified world models for governed institutions.

We are not building another chatbot, RAG wrapper, workflow bot, or generic agent toolchain.

We are working on a new class of AI systems that can model how real organizations change when actions are taken. In enterprise, public-sector, healthcare, financial, industrial, and regulated environments, an action is not successful just because an API call returns 200 OK. It is successful only if the right institutional state changed, the right approvals existed, the right evidence was present, downstream obligations were closed, hidden side effects were acceptable, and policy constraints were satisfied.
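As a rough illustration of that distinction, the success criteria above can be read as a conjunction of institutional checks that is independent of the transport-level status code. The sketch below is not HYLMAN's implementation; `ActionOutcome`, its field names, and both helper functions are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ActionOutcome:
    """Hypothetical record of what was observed after an action ran."""
    http_status: int
    state_changed_as_intended: bool
    approvals_present: bool
    evidence_present: bool
    obligations_closed: bool
    side_effects_acceptable: bool
    policy_satisfied: bool


def failed_conditions(outcome: ActionOutcome) -> List[str]:
    """Return the names of the institutional conditions that did not hold.

    The HTTP status is deliberately not consulted: a 200 OK says nothing
    about whether the institution ended up in the right state.
    """
    checks = {
        "state_changed_as_intended": outcome.state_changed_as_intended,
        "approvals_present": outcome.approvals_present,
        "evidence_present": outcome.evidence_present,
        "obligations_closed": outcome.obligations_closed,
        "side_effects_acceptable": outcome.side_effects_acceptable,
        "policy_satisfied": outcome.policy_satisfied,
    }
    return [name for name, ok in checks.items() if not ok]


def is_successful(outcome: ActionOutcome) -> bool:
    """An action succeeds only if every institutional condition holds."""
    return not failed_conditions(outcome)


# The API call returned 200 OK, but a downstream obligation is still open.
outcome = ActionOutcome(200, True, True, True, False, True, True)
print(is_successful(outcome))      # False
print(failed_conditions(outcome))  # ['obligations_closed']
```

In a real system each boolean would itself be the output of a check or a model prediction, often with uncertainty attached; flat booleans are used here only to keep the sketch short.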

The core research direction is action-conditioned institutional world modeling.

What you may work on

You may contribute to one or more of the following:

  • Building institutional state graphs from enterprise events, workflow logs, documents, policies, tickets, CRM records, ITSM records, approvals, evidence, obligations, controls, and system snapshots.
  • Designing receipt-backed transition datasets that bind pre-state, actor, authority, action, evidence, post-state diff, policy result, observability mask, and delayed outcome.
  • Training and evaluating action-conditioned transition models that predict structured state changes, hidden side effects, policy violations, evidence gaps, unresolved obligations, delayed risk, and uncertainty.
  • Developing graph/sequence/foundation-model architectures using long-context transformers, state-space models, graph neural networks, graph transformers, structured prediction, multi-head objectives, and policy-conditioned model heads.
  • Building ReceiptBench-style benchmarks, synthetic institutional worlds, public benchmark converters, ablation harnesses, model cards, eval dashboards, calibration tests, and failure taxonomies.
  • Designing model-predictive control for AI agents, where candidate actions are simulated and rejected before execution if they violate policy, lack evidence, exceed authority, or create unacceptable downstream risk.
  • Building secure MLOps/LLMOps/model-systems infrastructure: GPU training/inference pipelines, run registry, dataset versioning, reproducible evaluations, model serving, observability, FinOps, CI/CD, Kubernetes, Ray, MLflow/W&B, vLLM/Triton, and cloud/HPC deployment.
  • Developing formal policy and verification layers using Datalog/Rego-style rules, policy compilers, proof traces, RBAC/ABAC/IAM, audit contracts, schema validation, constrained decoding, and runtime guardrails.
  • Designing security, privacy, and governance architecture for AI systems operating over sensitive institutional data: GDPR, EU AI Act readiness, ISO 27001-style controls, customer-isolated partitions, access control, auditability, data minimization, secure agent/tool execution, and red-team evaluation.
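To make two of the items above more concrete, here is a minimal sketch of a receipt-backed transition record and of a pre-execution gate that rejects candidate actions. It is an illustration under stated assumptions, not the team's design: the `TransitionReceipt` fields simply mirror the list in the posting, while `CandidateAction`, `Prediction`, `gate`, `toy_simulate`, and the `risk_budget` threshold are hypothetical. The simulator is a hard-coded stub standing in for a learned transition model.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class TransitionReceipt:
    """One training record; fields mirror those named in the posting."""
    pre_state: dict
    actor: str
    authority: FrozenSet[str]
    action: str
    evidence: FrozenSet[str]
    post_state_diff: dict
    policy_result: bool
    observability_mask: FrozenSet[str]  # state keys that were observable
    delayed_outcome: Optional[str] = None  # filled in once it is known


@dataclass(frozen=True)
class CandidateAction:
    name: str
    required_authority: str
    required_evidence: FrozenSet[str]


@dataclass(frozen=True)
class Prediction:
    policy_violation: bool
    downstream_risk: float  # assumed to lie in [0, 1]


def gate(
    candidate: CandidateAction,
    actor_authority: FrozenSet[str],
    available_evidence: FrozenSet[str],
    simulate: Callable[[CandidateAction], Prediction],
    risk_budget: float = 0.2,
) -> List[str]:
    """Return rejection reasons; an empty list means the action may run."""
    reasons = []
    if candidate.required_authority not in actor_authority:
        reasons.append("exceeds authority")
    missing = candidate.required_evidence - available_evidence
    if missing:
        reasons.append("lacks evidence: " + ", ".join(sorted(missing)))
    prediction = simulate(candidate)  # simulated, never executed
    if prediction.policy_violation:
        reasons.append("violates policy")
    if prediction.downstream_risk > risk_budget:
        reasons.append("unacceptable downstream risk")
    return reasons


def toy_simulate(candidate: CandidateAction) -> Prediction:
    """Stub standing in for a learned transition model."""
    risk = 0.6 if candidate.name == "issue_refund" else 0.05
    return Prediction(policy_violation=False, downstream_risk=risk)


refund = CandidateAction(
    "issue_refund", "finance.refund", frozenset({"invoice", "manager_approval"})
)
reasons = gate(
    refund, frozenset({"support.reply"}), frozenset({"invoice"}), toy_simulate
)
print(reasons)
```

A fuller model-predictive controller would score several candidate actions over a planning horizon and pick the best admissible one; this gate only covers the "simulate, then reject before execution" step.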

We are especially interested in people with experience in

  • ML research: foundation models, LLMs, generative AI, transformer architectures, state-space models, long-context modeling, structured prediction, graph ML, GNNs, graph transformers, RLHF, RLVR, verifiable reward modeling, reinforcement learning, model-predictive control, uncertainty estimation, calibration, causal inference, delayed-outcome modeling.
  • Language, retrieval, and agents: NLP, entity/relation extraction, information extraction, semantic search, retrieval-augmented generation (RAG), agentic AI, tool-using agents, multi-agent systems, LangGraph, LlamaIndex, LangChain, MCP, function calling, structured outputs, JSON-schema validation, policy-conditioned generation, constrained decoding.
  • Graph and process data: knowledge graphs, Neo4j, RDF, SPARQL, Cypher, process mining, object-centric event logs, OCEL, OCPM, BPMN, Celonis, ServiceNow, ITSM, CRM, enterprise workflow systems.
  • Policy and verification: Datalog, Rego, OPA, formal verification, policy engines.
  • Model systems and GPU infrastructure: MLOps, LLMOps, MLflow, Weights & Biases, Kubernetes, Docker, Terraform, Ray, vLLM, NVIDIA Triton, DeepSpeed, FSDP, Accelerate, PyTorch, JAX, Hugging Face, H100/A100 GPU infrastructure.
  • Security and governance: secure cloud architecture, AI security, privacy engineering, audit logging, GDPR, EU AI Act, ISO 27001, SOC2, OWASP, regulated enterprise AI.

Ideal backgrounds

You may be a strong fit if you are one of the following:

  • A machine learning researcher or research engineer with hands-on experience in LLMs, graph ML, structured prediction, uncertainty, world models, RL, or model evaluation.
  • A knowledge graph / process mining / workflow intelligence expert who understands enterprise event logs, object-centric process data, BPMN, ITSM, CRM, ServiceNow, Celonis, Neo4j, RDF, SPARQL, Cypher, or policy-aware workflow systems.
  • A model systems / GPU infrastructure engineer who can build reliable training, inference, evaluation, observability, and cost-control infrastructure for foundation-model experimentation.
  • A formal methods / policy / verification engineer who can turn authority, evidence, approval, access-control, and compliance rules into executable policy programs and proof traces.
  • A security / privacy / AI governance architect who can make sensitive institutional AI systems auditable, isolated, secure, and regulator-ready.
  • A regulated enterprise AI builder who has shipped AI systems in healthcare, finance, insurance, public sector, industrial operations, enterprise SaaS, cybersecurity, or other high-accountability environments.

Required qualities

You should be able to operate in a founding environment: high ambiguity, high ownership, low bureaucracy, fast proof cycles, and direct technical accountability.

You should be able to show evidence of your work. We will ask for proof: GitHub, papers, patents, Google Scholar, Hugging Face, demos, architecture diagrams, model cards, dashboards, sanitized code, benchmark artifacts, references, production systems, awards, or other concrete validation.

What we need from applicants

Please apply by emailing the following to our team at ryan.adams@hylman.com:

  • Your CV.
  • A short note explaining which track fits you best: ML research / graph & process data / model systems & GPU infra / policy & verification / security & governance / regulated AI pilots.
  • Links to proof of work: GitHub, papers, portfolio, Hugging Face, Google Scholar, patents, talks, architecture diagrams, demos, benchmark results, public products, or references.
  • Your availability from July onward, including expected FTE or hours per week.
  • Your current location and ability to work on CET-compatible calls.
  • Any employer, IP, grant, consulting, visa, confidentiality, or conflict-of-interest restrictions that could affect participation.

Keywords: AI Research Engineer, Machine Learning Scientist, Founding Engineer, Foundation Models, LLMs, Generative AI, World Models, Institutional World Models, Action-Conditioned Models, Graph Machine Learning, Knowledge Graphs, Process Mining, Object-Centric Event Logs, OCEL, BPMN, ServiceNow, ITSM, CRM, State-Diff Prediction, Structured Prediction, Agentic AI, AI Agents, Tool-Using Agents, LangGraph, LlamaIndex, LangChain, MCP, RAG, Retrieval-Augmented Generation, MLOps, LLMOps, GPU Infrastructure, H100, A100, PyTorch, JAX, Hugging Face, DeepSpeed, FSDP, vLLM, NVIDIA Triton, Kubernetes, MLflow, Weights & Biases, Formal Verification, Policy Engines, Datalog, Rego, OPA, RBAC, ABAC, AI Safety, AI Security, Privacy Engineering, GDPR, EU AI Act, Regulated AI, Enterprise AI, Model Evaluation, Benchmarking, Calibration, Causal Inference, Delayed Risk, Reinforcement Learning, Model Predictive Control.

Skills

ABAC, AI Security, AWS Lambda, Celonis, CRM, Datalog, DeepSpeed, Docker, Entity/relation extraction, EU AI Act, FSDP, GDPR, GNNs, Graph ML, Graph transformers, H100/A100 GPU infrastructure, Hugging Face, IAM, Information extraction, ISO 27001, ITSM, JAX, JSON-schema validation, Kubernetes, LangChain, LangGraph, LlamaIndex, LLM, LLMOps, MLOps, MCP, Neo4j, NVIDIA Triton, NLP, Object-centric event logs, OCEL, OPA, OWASP, Policy engines, Process mining, PyTorch, RAG, RBAC, Rego, Reinforcement learning, Retrieval-augmented generation, RDF, RLHF, RLVR, ServiceNow, Semantic search, SPARQL, State-space models, Structured outputs, Terraform, Transformers, vLLM, Weights & Biases, World models
