Large-Scale AI Model Research Engineer
Jobgether
About the role
Advance AI technologies as a Large-Scale AI Model Research Engineer. Focus on developing cutting-edge architectures for large-scale pre-training systems that enhance model capability and efficiency in a collaborative, remote-friendly environment.
This role is pivotal in next-generation AI model development, allowing you to engage in deep scientific exploration while also performing hands-on engineering. You will work on massive GPU clusters, designing and optimizing training pipelines that run efficiently across thousands of NVIDIA GPUs. Collaborate with world-class researchers and engineers to push the boundaries of modern AI systems and contribute to foundational breakthroughs.
Key Responsibilities:
- Conduct large-scale model pre-training on distributed GPUs
- Design and optimize novel model architectures
- Analyze results and refine training methodologies
- Identify training system bottlenecks and resolve issues
- Improve infrastructure for next-gen AI workloads
Requirements:
- PhD or equivalent experience in AI or related fields
- Hands-on experience with large-scale LLM pre-training
- Strong proficiency in PyTorch and Hugging Face
- Deep knowledge of transformer architectures and model design
- Excellent debugging and collaboration skills
Drive cutting-edge AI innovation by working with complex GPU frameworks and collaborating closely with leading researchers in the field.