Generative AI Engineer

Infojini Inc

Philadelphia · On-site · Contract · 3mo ago

About the role

Core Experience

Hands-on experience deploying open-source LLMs such as Meta Llama 3 and Mistral / Mixtral in on-prem or private environments
Strong proficiency in Python for LLM inference, prompt engineering, and integration
Experience with CPU-based inference, model quantization, and performance tuning

Vector Databases & RAG

Practical experience with open-source vector databases such as Qdrant, Chroma, Milvus, or pgvector
Proven track record implementing Retrieval-Augmented Generation (RAG) pipelines
Experience generating and managing embeddings and applying metadata filtering

Security & Governance

Understanding of data privacy, air-gapped deployments, and enterprise security requirements
Experience implementing access controls and audit logging

Nice to Have

Experience with LangChain or LlamaIndex
Exposure to Rust, Go, or C++ for high-performance services
Familiarity with Docker and Kubernetes for on-prem deployments
Knowledge of inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
Prior work in regulated or enterprise environments

Deliverables

Reference architecture and deployment guidance
Working prototype (LLM + vector DB + RAG)
Documentation and knowledge transfer to internal teams

Skills

C++, Chroma, Docker, Go, Hugging Face Transformers, Kubernetes, LangChain, Llama 3, LlamaIndex, llama.cpp, Milvus, Mistral, Mixtral, pgvector, Python, Qdrant, Rust, vLLM
