Senior Solutions Architect, Generative AI
Nvidia
About the Role
As a Generative AI Solution Architect at NVIDIA, you will play a crucial role in architecting cutting‑edge solutions that leverage NVIDIA's generative AI technologies, with a focus on Large Language Models (LLMs) and Retrieval‑Augmented Generation (RAG) workflows.
If you stand out from the crowd by optimizing LLMs for speed, memory efficiency, and resource utilization, and are familiar with containerization technologies and GPU cluster architecture, NVIDIA wants to hear from you. NVIDIA offers competitive salaries and a generous benefits package, and is an equal opportunity employer that values diversity. Join us if you are a creative, autonomous engineer who is passionate about technology.
Responsibilities
- Architecting end‑to‑end generative AI solutions, specifically focusing on LLMs and RAG workflows.
- Collaborating closely with customers to understand their language‑related business challenges and designing tailored solutions.
- Supporting pre‑sales activities by collaborating with sales and business development teams, including delivering technical presentations and demonstrations of LLM and RAG capabilities.
- Providing feedback and contributing to the evolution of generative AI technologies by working closely with NVIDIA engineering teams.
- Engaging directly with customers to understand their language‑related requirements and challenges.
- Leading workshops and design sessions to define and refine generative AI solutions centered on LLMs and RAG workflows.
- Leading the training and optimization of Large Language Models using NVIDIA's hardware and software platforms.
- Implementing strategies for efficient training of LLMs to achieve optimal performance.
- Designing and implementing RAG‑based workflows to enhance content generation and information retrieval.
- Integrating RAG workflows into customer applications and systems.
- Staying informed about the latest developments in language models and generative AI technologies.
- Providing technical leadership and guidance on best practices for training LLMs and implementing RAG‑based solutions.
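The RAG workflows named in these responsibilities can be sketched, in very simplified form, as retrieve-then-prompt: fetch the documents most relevant to a query, then assemble them into an augmented prompt for an LLM. The sketch below is purely illustrative; it uses a toy keyword-overlap retriever, and all function names are hypothetical. A production system would instead use dense embeddings, a vector database, and an actual model call.

```python
# Illustrative retrieve-then-prompt RAG sketch (toy retriever, no model call).
from collections import Counter

def tokenize(text: str) -> list[str]:
    return text.lower().split()

def score(query: str, doc: str) -> int:
    """Count overlapping tokens between the query and a document."""
    return sum((Counter(tokenize(query)) & Counter(tokenize(doc))).values())

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents ranked by token overlap with the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the augmented prompt an LLM would receive."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{ctx}\n\nQuestion: {query}"

docs = [
    "NVIDIA GPUs accelerate LLM training and inference.",
    "RAG combines retrieval with generation to ground answers.",
    "Containerization simplifies deployment of AI services.",
]
query = "How does RAG ground LLM answers?"
context = retrieve(query, docs)
prompt = build_prompt(query, context)
```

In a real deployment, `retrieve` would query a vector index of embedded documents and `build_prompt` would feed an inference endpoint; the retrieve/augment/generate split shown here is the part that carries over.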
Qualifications
- Master's or Ph.D. in Computer Science, Artificial Intelligence, or equivalent experience.
- 5+ years of hands‑on experience in a technical role, specifically focusing on generative AI, with a strong emphasis on training Large Language Models (LLMs).
- Proven track record of successfully deploying and optimizing LLM models for inference in production environments.
- In‑depth understanding of state‑of‑the‑art language models such as GPT‑3, BERT, or similar architectures.
- Expertise in training and fine‑tuning LLMs using frameworks like TensorFlow, PyTorch, or Hugging Face Transformers.
- Proficiency in model deployment and optimization techniques for efficient inference on various hardware platforms, with a focus on GPUs.
- Strong knowledge of GPU cluster architecture and parallel processing for accelerated model training and inference.
- Excellent communication and collaboration skills to articulate complex technical concepts to technical and non‑technical stakeholders.
- Experience leading workshops, training sessions, and presenting technical solutions to diverse audiences.