Data Engineer / AI Engineer (Agentic AI Platform – Financial Data)

Jobs via Dice

Philadelphia · Hybrid · Contract · Posted today

About the Role

We are building a platform that converts unstructured financial data (emails, corporate actions, index announcements) into high-quality, structured datasets used by financial institutions.

This is not a typical “LLM wrapper” role.

You will work on systems that:

  • Extract data from noisy, inconsistent sources
  • Validate and reconcile outputs across multiple inputs
  • Ensure correctness, traceability, and auditability

The challenge is not just applying LLMs—it’s making them reliable in production for financial workflows.

What You’ll Work On

  • Designing pipelines that process high-volume financial documents (batch + near real-time)
  • Building LLM-powered extraction workflows (classification, parsing, summarization)
  • Implementing validation layers (rule-based + model-based) to reduce hallucinations
  • Developing retrieval systems using embeddings and vector search
  • Architecting end-to-end systems: ingestion → processing → storage → serving
  • Ensuring data quality, observability, and fault tolerance
  • Collaborating with product to turn messy data into usable financial intelligence

Core Requirements

  • Strong Python and backend/data engineering experience
  • Experience building production data pipelines (ETL, streaming, or async systems)
  • Solid understanding of distributed systems and failure modes
  • Experience working with LLM-based systems in production:
    • Prompt design
    • Output validation
    • Retry/fallback strategies
    • Evaluation and monitoring
  • Experience with data storage systems (SQL + NoSQL)
  • Familiarity with cloud infrastructure (AWS or similar)

Preferred Experience

  • Experience with RAG / vector search systems
  • Background in financial data or capital markets
  • Experience with streaming systems (Kafka, etc.)
  • Experience building multi-step or agent-style workflows

What Makes This Role Interesting

  • Work on high-accuracy AI systems where correctness matters
  • Solve real problems around:
    • LLM reliability and hallucination mitigation
    • Data consistency across conflicting sources
    • Real-time vs correctness tradeoffs
  • Build systems used in financial decision-making workflows
  • High ownership over core architecture in an early-stage environment

Nice to Have (but Not Required)

  • Experience with orchestration tools (Airflow, etc.)
  • Exposure to evaluation frameworks for LLMs
  • Experience working with large-scale document processing

Tech Stack (Representative, not exhaustive)

  • Python, APIs, async processing
  • LLM APIs + embeddings
  • SQL / NoSQL databases
  • Cloud infrastructure (AWS)
  • Data pipelines and streaming systems
  • Vector databases

Skills

AWS, APIs, Async processing, Cloud infrastructure, Data pipelines, Embeddings, ETL, Kafka, LLM, NoSQL, Observability, Python, RAG, SQL, Streaming systems, Vector databases, Vector search
