Software Developer / Engineer (Mid-Level)
Real Soft, Inc / Diversity Direct
Philadelphia · Hybrid · Contract · Mid Level · 2w ago
Role Overview
We are seeking a mid-level Software Developer/Engineer with hands-on experience deploying on-premises LLM solutions and vector databases. The ideal candidate will have strong expertise in Python, RAG pipelines, and enterprise-grade AI system implementation within secure environments.
Key Responsibilities
- Deploy and manage open-source LLMs (e.g., Llama 3, Mistral/Mixtral) in on-prem or private environments
- Develop and optimize LLM inference workflows using Python
- Implement Retrieval-Augmented Generation (RAG) pipelines
- Design and integrate vector database solutions for efficient semantic search
- Perform model quantization and performance tuning for CPU-based inference
- Ensure data privacy, security, and governance compliance in enterprise environments
- Implement access controls, logging, and monitoring mechanisms
- Deliver a reference architecture, prototypes, and technical documentation
- Collaborate with internal teams for knowledge transfer and system adoption
Required Skills & Qualifications
- Strong experience with Python for AI/ML and backend development
- Hands-on experience with open-source LLM deployment (Llama 3, Mistral, Mixtral)
- Experience with CPU-based inference and optimization techniques
- Practical experience with vector databases (Qdrant, Chroma, Milvus, pgvector)
- Proven experience building RAG pipelines
- Knowledge of embeddings, similarity search, and metadata filtering
- Understanding of enterprise security, data privacy, and air-gapped environments
Preferred Qualifications
- Experience with LangChain or LlamaIndex
- Familiarity with Docker and Kubernetes
- Exposure to Rust, Go, or C++ for high-performance systems
- Experience with LLM inference frameworks (vLLM, llama.cpp, Hugging Face Transformers)
- Prior experience in regulated or enterprise environments
Deliverables
- End-to-end reference architecture for LLM + vector DB solutions
- Functional prototype (LLM + RAG + Vector DB)
- Comprehensive documentation and knowledge transfer
Skills
Chroma, Docker, Go, Hugging Face Transformers, Kubernetes, LangChain, Llama 3, LlamaIndex, llama.cpp, Milvus, Mistral, Mixtral, pgvector, Python, Qdrant, RAG, Rust, vLLM