Data Engineer / AI Engineer (Agentic AI Platform – Financial Data)

Jobs via Dice

Philadelphia · Hybrid · Contract · 1mo ago

About the role

We are building a platform that converts unstructured financial data (emails, corporate actions, index announcements) into high-quality, structured datasets used by financial institutions.

This is not a typical “LLM wrapper” role.

You will work on systems that:

Extract data from noisy, inconsistent sources
Validate and reconcile outputs across multiple inputs
Ensure correctness, traceability, and auditability

The challenge is not just applying LLMs—it’s making them reliable in production for financial workflows.

What You’ll Work On

Designing pipelines that process high-volume financial documents (batch + near real-time)
Building LLM-powered extraction workflows (classification, parsing, summarization)
Implementing validation layers (rule-based + model-based) to reduce hallucinations
Developing retrieval systems using embeddings and vector search
Architecting end-to-end systems: ingestion → processing → storage → serving
Ensuring data quality, observability, and fault tolerance
Collaborating with product to turn messy data into usable financial intelligence

Core Requirements

Strong Python and backend/data engineering experience
Experience building production data pipelines (ETL, streaming, or async systems)
Solid understanding of distributed systems and failure modes
Experience working with LLM-based systems in production:
- Prompt design
- Output validation
- Retry/fallback strategies
- Evaluation and monitoring
Experience with data storage systems (SQL + NoSQL)
Familiarity with cloud infrastructure (AWS or similar)

Preferred Experience

Experience with RAG / vector search systems
Background in financial data or capital markets
Experience with streaming systems (Kafka, etc.)
Experience building multi-step or agent-style workflows

What Makes This Role Interesting

Work on high-accuracy AI systems where correctness matters
Solve real problems around:
- LLM reliability and hallucination mitigation
- Data consistency across conflicting sources
- Real-time vs correctness tradeoffs
Build systems used in financial decision-making workflows
High ownership over core architecture in an early-stage environment

Nice To Know (but Not Required)

Experience with orchestration tools (Airflow, etc.)
Exposure to evaluation frameworks for LLMs
Experience working with large-scale document processing

Tech Stack (Representative, not exhaustive)

Python, APIs, async processing
LLM APIs + embeddings
SQL / NoSQL databases
Cloud infrastructure (AWS)
Data pipelines and streaming systems
Vector Databases

Skills

AWS, APIs, Async processing, Cloud infrastructure, Data pipelines, Embeddings, ETL, Kafka, LLM, NoSQL, Observability, Python, RAG, SQL, Streaming systems, Vector databases, Vector search
