
Sr. Data Engineer with GenAI

Jobs via Dice

Irving · Hybrid · Full-time · Senior · 3w ago

About the role

Project Mission

To lead the design and execution of the data ingestion and preparation pipeline, which forms the knowledge foundation for the AI system. This role is responsible for ensuring the AI has access to accurate, well-structured, and contextually rich metadata.

Key Responsibilities

  • Architect and orchestrate the entire data and metadata ingestion process for the AI's knowledge base.
  • Design and oversee the development of scripts and processes to extract schema definitions, query logs, business glossaries, and other metadata from source systems.
  • Lead a team of data engineers in transforming, cleansing, and formatting data to prepare it for vectorization.
  • Collaborate with the Architect to design the optimal data model for the vector database.
  • Serve as the subject matter expert on data sources, liaising with data owners to understand structures, access patterns, and semantics.
  • Ensure the data ingestion pipeline is robust, repeatable, and well-documented.

Required Skills & Experience

  • Senior-level experience in data engineering, including designing and building complex ETL/ELT pipelines.
  • Expert-level proficiency in SQL and Python for data processing and automation.
  • Hands-on experience with vector databases (e.g., Pinecone, Chroma, PGvector) and a working understanding of embeddings.
  • Experience working with a variety of data sources, from structured databases to semi-structured API outputs.
  • Strong leadership and mentorship skills to guide a remote or distributed team.

Skills

AWS Lambda, Chroma, Docker, ETL, GenAI, Pinecone, Python, PGvector, SQL, Vector Databases
