Distinguished Engineer - AI Computing System

Huawei Canada

Canada · On-site · Full-time · Mid Level · CA$172k – CA$230k/yr · Today

About the role

About the Job:

Lead the development of software frameworks and technologies for training clusters, focusing on large model training frameworks and key features for scenarios such as pre-training, post-training, and integrated training and inference.

Drive the company's large model training optimization efforts, leading the development of low-precision training, parallel strategy tuning, and resource optimization, and bringing these capabilities into commercial use.

Oversee the development of large model AI training frameworks, operator libraries, and acceleration features for training servers and super nodes, enhancing AI cluster computing efficiency.

Identify and collaborate on academic resources, standards, and patents in the large model training domain to support ongoing innovation and build long-term competitiveness.

Cultivate a team of technical experts and key personnel in AI training cluster frameworks and software optimization.

The base salary ranges from CA$172,000 to CA$230,000 per year, depending on education, experience, and expertise.

Job Requirements:

  • Major in AI, computer science, software, automation, physics, mathematics, electronics, microelectronics, IT, or related fields, with over 5 years of R&D experience in large model training and optimization.
  • Proficient in large model architectures such as DeepSeek and Llama, with deep expertise in training and inference optimization for areas such as LLMs, MoE, and multimodal learning.
  • Familiar with AI hardware architectures such as GPUs and NPUs, with experience in optimizing AI systems through software-hardware co-design.
  • Experience in cluster and cloud computing, including software architecture design for cluster scheduling.
  • Strong research interest, learning ability, communication skills, and teamwork.

Additional Details:

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and IT
  • Industry: Telecommunications

Skills

AI, Cloud Computing, DeepSeek, GPU, LLM, Llama, MoE, Multimodal Learning, NPU, Software Architecture, Training Optimization
