Senior ML Engineer

Recro

Jaipur · On-site · Full-time · Senior · Posted yesterday

About the Role

We are looking for a highly skilled Senior Machine Learning Engineer to build and scale next-generation generative AI systems. This role sits at the intersection of machine learning and backend infrastructure, focusing on taking advanced models from experimentation to reliable, high-performance production systems.

You will work on cutting-edge generative video and multimodal AI use cases, contributing to scalable, low-latency systems used by millions of users globally.

Key Responsibilities

- Design, train, fine-tune, and evaluate generative and multimodal models (e.g., text-to-video, image-to-video, lip-sync, character consistency)
- Build and manage end-to-end ML pipelines, including data ingestion, preprocessing, training, evaluation, and model versioning
- Deploy and maintain scalable ML systems, including model serving, containerization, and GPU-optimized inference
- Implement MLOps best practices such as experiment tracking, model monitoring, drift detection, and A/B testing
- Optimize inference systems for low latency, high throughput, and cost-efficient GPU utilization
- Develop batching and caching strategies to meet production SLAs
- Collaborate with backend and platform teams to integrate ML services into distributed systems
- Contribute to long-term AI strategy, including foundational model training and fine-tuning pipelines
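The batching responsibility above follows a common serving pattern for meeting latency SLAs: accumulate incoming requests until either a size cap or a time budget is hit, then run them through the model in one GPU pass. A minimal pure-Python sketch of the accumulation step (all names, sizes, and timeouts here are illustrative, not taken from the posting):

```python
import time
from collections import deque

def collect_batch(queue, max_batch_size=8, max_wait_s=0.01):
    """Drain up to max_batch_size requests from the queue, waiting at most
    max_wait_s for more to arrive. Hypothetical helper for illustration."""
    batch = []
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        if queue:
            batch.append(queue.popleft())
        elif time.monotonic() < deadline:
            time.sleep(0.001)  # briefly wait for more requests to queue up
        else:
            break  # time budget spent; ship a partial batch
    return batch

# Usage: simulate 20 queued inference requests, no extra waiting
requests = deque(range(20))
batches = []
while requests:
    batches.append(collect_batch(requests, max_batch_size=8, max_wait_s=0))
```

In practice this logic usually lives inside a serving framework (e.g., Triton's dynamic batcher) rather than hand-rolled application code; the sketch only shows the size-or-deadline trade-off the posting alludes to.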

Required Qualifications

- 4–10 years of experience in Machine Learning or Applied ML Engineering
- Strong fundamentals in deep learning, Transformers, and generative model architectures
- Hands-on experience with large-scale model training and fine-tuning (e.g., LoRA, full fine-tuning)
- Proven experience in deploying and scaling ML models in production environments
- Strong understanding of MLOps practices and tools (e.g., MLflow, Weights & Biases)
- Experience with model serving frameworks such as Triton, TorchServe, vLLM, or similar
- Proficiency in Python and frameworks like PyTorch
- Experience working with cloud platforms (AWS, GCP, or Azure), including GPU provisioning and autoscaling
- Ability to work in fast-paced, ambiguous environments with cross-functional teams
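Among the required skills, LoRA fine-tuning has a compact mathematical core: the base weight W stays frozen and training only updates a low-rank correction scaled by alpha/r, so the effective layer computes x·Wᵀ + (alpha/r)·x·Aᵀ·Bᵀ. A NumPy sketch of that forward pass (dimensions and constants are illustrative; assumes NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, alpha = 16, 4, 8                   # hidden size, LoRA rank, scaling (illustrative)
W = rng.standard_normal((d, d))          # frozen base weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-init so the update starts at 0

def lora_forward(x, W, A, B, alpha, r):
    """Base path plus scaled low-rank path: x W^T + (alpha/r) (x A^T) B^T."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((2, d))
y = lora_forward(x, W, A, B, alpha, r)
```

Because B is zero-initialized, the LoRA path contributes nothing before training begins, which is why the technique can be bolted onto a pretrained model without disturbing its initial behavior.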

Preferred Qualifications

- Experience with video generation, diffusion models, or multimodal architectures
- Familiarity with LoRA/IC-LoRA techniques for character or identity consistency
- Knowledge of inference optimization techniques such as quantization (FP8/INT8), batching, and GPU memory management
- Experience with audio/video systems (e.g., TTS, voice cloning, lip-sync pipelines)
- Background in media, OTT, or large-scale content platforms
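Of the inference-optimization topics listed above, symmetric per-tensor INT8 quantization is the easiest to illustrate: floats are mapped to 8-bit integers via a single scale factor (max|v| / 127), trading a small precision loss for a 4× reduction in memory and bandwidth versus FP32. A dependency-free sketch (function names are mine, not any library's API):

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization: scale = max|v| / 127."""
    scale = max(abs(v) for v in values) / 127 or 1.0  # guard against all-zero input
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)  # close to the originals, within one quantization step
```

Real serving stacks do this per-channel and on GPU (e.g., via TensorRT or PyTorch quantization tooling), but the round-trip above is the essential idea behind the FP8/INT8 requirement.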

What We Offer

- Competitive compensation
- Opportunity to work on cutting-edge AI products at scale
- High-impact role with ownership across the ML lifecycle
- Collaborative and fast-paced work environment
- Continuous learning and growth opportunities
