Lead Data Scientist and Machine Learning Lead
Eucloid Data Solutions
About the role
Job Description
We are looking for a Lead Data Scientist – Vision & Multimodal AI to architect and build next-generation Vision-Language Model (VLM) systems at scale.
This role requires deep expertise in:
• Architecting and implementing RLHF (Reinforcement Learning from Human Feedback) frameworks.
• Training and fine-tuning open-source Vision-Language Models (VLMs).
• Deploying and scaling multimodal models in production to serve millions of requests.
Key Responsibilities
Architect & Build RLHF Frameworks
• Design end-to-end RLHF pipelines (SFT → Reward Modeling → PPO/DPO)
• Develop scalable human feedback collection systems
• Implement preference modeling and ranking pipelines
• Optimize reward models for multimodal outputs (image + text)
• Build automated evaluation frameworks
Train & Fine-Tune OSS Vision-Language Models
• Work with open-source models such as Qwen-VL, Llama, and GPT OSS
• Pretraining / instruction tuning multimodal models
• Parameter-efficient fine-tuning (LoRA, QLoRA)
• Dataset curation & synthetic data generation
• Scaling training on multi-GPU / multi-node clusters
• Optimizing for alignment, hallucination reduction, and safety
Highly Scalable Deployment of VLM Systems
• Design distributed inference pipelines (GPU-optimized)
• Model serving using vLLM and Triton Inference Server
• Optimize latency, throughput, and cost
• Implement batching, KV caching, quantization, tensor parallelism
• Deploy on Kubernetes-based infrastructure
• Build monitoring for drift, performance, and hallucinations
Multimodal AI System Design
• Architect systems combining OCR, vision encoders, LLMs, retrieval
• Implement retrieval-augmented multimodal pipelines
• Design evaluation benchmarks for VQA, grounding, and reasoning
• Ensure model safety and guardrails
Technical Leadership
• Lead a team of ML engineers & research scientists
• Define technical roadmap for multimodal AI
• Review model architectures & code quality
• Collaborate with product and infrastructure teams
Qualifications
• 6+ years in ML / AI
• 2+ years working with large-scale LLM or VLM systems
• Strong hands-on experience building RLHF pipelines (not just using libraries)
• Deep PyTorch expertise
• Experience training models at the 7B-parameter scale
• Experience with distributed training (DeepSpeed, FSDP)
• Production-grade deployment experience handling 10k+ QPS workloads
• Strong understanding of transformer architectures