Data Scientist - Support Top-Tier Entrepreneurs

RayAI Inc.

Remote · South Africa · Full-time · 3mo ago

About the role

A Message from Our CEO

We are seeking a highly skilled Data Scientist with deep expertise in modern AI/ML systems, including LLMs, multimodal models, fine-tuning techniques, and advanced retrieval architectures. In this role, you will design, prototype, and deploy AI-powered solutions that leverage state-of-the-art language, vision, and agentic frameworks. You will work closely with engineering, product, and research teams across the US and Europe to bring cutting-edge AI capabilities into production environments.

Responsibilities

Design, build, and optimize LLM-powered systems using OpenAI, Anthropic, and open-source/local model families.
Architect and implement RAG pipelines, including hybrid search, query rewriting, prompt optimization, and reranking strategies.
Develop and maintain vector database infrastructures (Pinecone, Weaviate, Qdrant) for large-scale embedding storage and fast retrieval.
Train, evaluate, and retrain embedding models for domain-specific semantic search and knowledge retrieval.
Build and integrate multimodal AI solutions using OCR, CLIP, and modern vision architectures for text-image understanding.
Apply fine-tuning techniques (LoRA/QLoRA) to adapt foundation models to organizational datasets and specialized tasks.
Develop production‑ready AI applications using Python, PyTorch, and modern orchestration frameworks.
Implement LLM orchestration with LangChain or LlamaIndex, including evaluators, tool abstractions, memory, and RAG components.
Establish robust evaluation frameworks to measure model performance, reduce hallucination, and ensure reliability in production.
Build agentic workflows using AutoGen, CrewAI, or similar frameworks to power automation and multi‑agent collaboration systems.
Stay current with research trends and apply theoretical and practical insights in Generative AI to drive innovation across the organization.

Requirements

Experience in applied machine learning or data science, with at least 2 years focused specifically on LLMs or Generative AI.
Demonstrated experience building end‑to‑end RAG, fine‑tuning, or multimodal AI systems.
Strong proficiency in Python, PyTorch, and AI tooling ecosystems.
Experience deploying models at scale in production environments.
Strong understanding of evaluation metrics, model reliability, and safety, including hallucination reduction.
Familiarity with vector embeddings, vector databases, and semantic search.
Experience with agent frameworks such as AutoGen, CrewAI, or LangGraph‑like toolkits.
Experience with distributed training, model optimization, quantization, or GPU acceleration.
Knowledge of DevOps/MLOps tooling for deploying LLM‑based systems.
Contributions to open‑source LLM or RAG projects.

What We Offer

Competitive salary and performance‑based bonuses.
Fully remote, flexible work environment.
Modern laptop and hardware provided by us.
Specialized training in AI, automation, and digital productivity tools.
Global exposure: collaborate with top‑tier founders and fast‑growing startups.
Continuous learning and career growth opportunities in an international environment.

Skills

AutoGen, CLIP, CrewAI, Generative AI, LangChain, LangGraph, LLM, LlamaIndex, LoRA, MLOps, OCR, OpenAI, Pinecone, Python, PyTorch, QLoRA, RAG, Weaviate

