Senior Data Scientist (GenAI)

Acuity Analytics

US · On-site · Full-time

About the role

The data-driven mind behind every breakthrough.
Put your skills to the test to build AI solutions that shape the way businesses operate and innovate.

About Us

Ascent has recently been acquired by Acuity Analytics. This is both a significant milestone for us and a tremendous opportunity for you. Acuity Analytics is a business with a strong global reputation, an impressive client base and ambitious growth plans.

We deliver deep insights and domain-led digital transformation to high-growth and heavily regulated organisations.
To our customers, we bring a partnership that provides the talent, technology and capability to enhance performance and operational efficiency.

About the Role

Join the AI Platform team to drive RAG retrieval relevance and LLM answer quality across emails and attachments.

Define evaluation datasets, labelling pipelines, metrics, and experiments to tune chunking, embeddings (Flamingo API), hybrid search, filtering, and citation grounding.

Your insights will directly improve model quality, retrieval performance, and end-user impact.

About You

Detail-oriented, hands-on, and metric-driven.
Curious about LLMs, vector search, and retrieval systems.
Collaborative with engineers, ETL teams, and business users.
Comfortable designing evaluation pipelines and experiments.
Strong communicator: turns feedback into actionable insights.

What You’ll Do

Define evaluation sets across clients, content types, and languages.
Build labelling pipelines and dashboards for manual or UI-based feedback.
Measure and tune hybrid retrieval (semantic + keyword), top‑k, rerankers, and filters.
Evaluate embedding models and chunking strategies for accuracy and coverage.
Assess LLM answer quality: grounded factuality, hallucination rate, completeness.
Analyze failure patterns across queries, long contexts, and attachments versus emails.
Collaborate with backend, ETL, and frontend teams to align telemetry, schema, and feedback capture.
Translate user feedback into actionable metrics and backlog items.
Evaluate cost/performance trade‑offs for summarization, spillover, and indexing.

Required Skills

Applied ML/Data Science with strong evaluation discipline.
Python (pandas, Jupyter/Notebooks) + SQL.
Information retrieval / RAG evaluation: Recall@K, MRR, nDCG, citation grounding.
Experience with LLMs: prompting, grounding, long‑context handling, cost/latency awareness.
Building reproducible pipelines and dashboards.
Strong communicator: translating business questions into metrics and experiments.
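
For candidates who want a concrete picture of the retrieval metrics named above, here is a minimal Python sketch of Recall@K, MRR, and nDCG. It is an illustration only, not the team's evaluation code: the function names, the document IDs, and the assumption that each ranked list contains unique IDs are all hypothetical.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(ranked_ids, gains, k):
    """nDCG@k with graded relevance; `gains` maps doc ID -> relevance grade."""
    dcg = sum(
        gains.get(doc_id, 0) / math.log2(i + 2)
        for i, doc_id in enumerate(ranked_ids[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query: four retrieved chunks, two of which are relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
print(recall_at_k(ranked, relevant, k=2))
print(mean_reciprocal_rank([(ranked, relevant)]))
print(ndcg_at_k(ranked, {"b": 1, "d": 1}, k=4))
```

In practice these would be computed per query over a labelled evaluation set and then averaged by client, content type, or language.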

Nice-to-Have

Hands‑on with Azure AI Search vector/hybrid tuning, index schema.
Experience designing human‑in‑the‑loop annotation pipelines.
Knowledge of enterprise access‑control / security trimming.
Experience tracking embedding drift and retrieval failures.
Working with large‑scale document/email datasets.

Why Join Us

People are at the heart of our business. By investing in people, we achieve exceptional results for our clients and create new opportunities for our teams to thrive.

If you have any questions, contact our Talent Acquisition team at ta.admin@acuityanalytics.

Ascent is now part of Acuity Analytics.

Skills

Azure AI Search
Docker
Flamingo API
Jupyter
LLMs
Pandas
Python
RAG
SQL
Vector search