Software Developer / Engineer (Mid-Level)

Real Soft, Inc / Diversity Direct

Philadelphia · Hybrid · Contract · Mid Level · 2w ago

About the role

Role Overview

We are seeking a mid-level Software Developer/Engineer with hands-on experience deploying on-premises LLM solutions and vector databases. The ideal candidate will have strong expertise in Python, RAG pipelines, and enterprise-grade AI system implementation in secure environments.

Key Responsibilities

  • Deploy and manage open-source LLMs (e.g., Llama 3, Mistral/Mixtral) in on-premises or private environments
  • Develop and optimize LLM inference workflows using Python
  • Implement Retrieval-Augmented Generation (RAG) pipelines
  • Design and integrate vector database solutions for efficient semantic search
  • Perform model quantization and performance tuning for CPU-based inference
  • Ensure data privacy, security, and governance compliance in enterprise environments
  • Implement access controls, logging, and monitoring mechanisms
  • Deliver a reference architecture, prototypes, and technical documentation
  • Collaborate with internal teams for knowledge transfer and system adoption

Required Skills & Qualifications

  • Strong experience with Python for AI/ML and backend development
  • Hands-on experience with open-source LLM deployment (Llama 3, Mistral, Mixtral)
  • Experience with CPU-based inference and optimization techniques
  • Practical experience with vector databases (Qdrant, Chroma, Milvus, pgvector)
  • Proven experience building RAG pipelines
  • Knowledge of embeddings, similarity search, and metadata filtering
  • Understanding of enterprise security, data privacy, and air-gapped environments

Preferred Qualifications

  • Experience with LangChain or LlamaIndex
  • Familiarity with Docker and Kubernetes
  • Exposure to Rust, Go, or C++ for high-performance systems
  • Experience with LLM inference frameworks (vLLM, llama.cpp, Hugging Face Transformers)
  • Prior experience in regulated or enterprise environments

Deliverables

  • End-to-end reference architecture for LLM + vector DB solutions
  • Functional prototype (LLM + RAG + Vector DB)
  • Comprehensive documentation and knowledge transfer

Skills

Chroma, Docker, Go, Hugging Face Transformers, Kubernetes, LangChain, Llama 3, LlamaIndex, llama.cpp, Milvus, Mistral, Mixtral, pgvector, Python, Qdrant, RAG, Rust, vLLM
