Senior Data Engineer / AI Engineer (Agentic AI Platform, Financial Data)

Jobs via Dice

Philadelphia · Hybrid · Contract · Senior · Posted today

About Us

Dice is the leading career destination for tech experts at every stage of their careers. Our client, SANS, is seeking the following.

About the Role

We are building a platform that converts unstructured financial data (emails, corporate actions, index announcements) into high-quality, structured datasets used by financial institutions.

This is not a typical LLM wrapper role.

You will work on systems that:

  • Extract data from noisy, inconsistent sources
  • Validate and reconcile outputs across multiple inputs
  • Ensure correctness, traceability, and auditability

The challenge is not just applying LLMs; it's making them reliable in production for financial workflows.

What You'll Work On

  • Designing pipelines that process high-volume financial documents (batch + near real-time)
  • Building LLM-powered extraction workflows (classification, parsing, summarization)
  • Implementing validation layers (rule-based + model-based) to reduce hallucinations
  • Developing retrieval systems using embeddings and vector search
  • Architecting end-to-end systems: ingestion → processing → storage → serving
  • Ensuring data quality, observability, and fault tolerance
  • Collaborating with product to turn messy data into usable financial intelligence

Core Requirements

  • Strong Python and backend/data engineering experience
  • Experience building production data pipelines (ETL, streaming, or async systems)
  • Solid understanding of distributed systems and failure modes
  • Experience working with LLM-based systems in production:
    • Prompt design
    • Output validation
    • Retry/fallback strategies
    • Evaluation and monitoring
  • Experience with data storage systems (SQL + NoSQL)
  • Familiarity with cloud infrastructure (AWS or similar)

Preferred Experience

  • Experience with RAG / vector search systems
  • Background in financial data or capital markets
  • Experience with streaming systems (Kafka, etc.)
  • Experience building multi-step or agent-style workflows

What Makes This Role Interesting

  • Work on high-accuracy AI systems where correctness matters
  • Solve real problems around:
    • LLM reliability and hallucination mitigation
    • Data consistency across conflicting sources
    • Real-time vs correctness tradeoffs
  • Build systems used in financial decision-making workflows
  • High ownership over core architecture in an early-stage environment

Nice To Know (but Not Required)

  • Experience with orchestration tools (Airflow, etc.)
  • Exposure to evaluation frameworks for LLMs
  • Experience working with large-scale document processing

Tech Stack (Representative, not exhaustive)

  • Python, APIs, async processing
  • LLM APIs + embeddings
  • SQL / NoSQL databases
  • Cloud infrastructure (AWS)
  • Data pipelines and streaming systems
  • Vector Databases

Skills

AWS, APIs, Async processing, Cloud infrastructure, Data pipelines, Embeddings, ETL, Evaluation frameworks, Kafka, LLM, NoSQL, Observability, Orchestration tools, Python, RAG, SQL, Streaming systems, Vector databases, Vector search
