Intern Engineer – RL Post-Training for LLMs

Huawei Technologies Canada Co., Ltd.

1 day ago

Internship

On-site

Vancouver, 02

$58,000 - $104,000 CAD yearly

JobsCloseBy Editorial Insights

Huawei Canada is offering an on-site six-to-twelve-month internship in Vancouver for an Intern Researcher in the Computing Data Application Acceleration Lab. You will develop and optimize RL post-training pipelines for LLMs, explore GRPO and reward modeling, run experiments to improve performance and alignment, and help build scalable training, evaluation, and data generation systems while collaborating with researchers on cutting-edge LLM projects. The ideal candidate is a Master's or PhD student in computer science or AI with a solid ML, RL, and deep learning background, familiarity with transformer architectures and LLM frameworks, and proficiency in Python and PyTorch. Show strong problem-solving and communication skills, and highlight open-source work or RL infrastructure experience. Huawei reviews applications directly, without AI screening, and accommodations are available.

Huawei Canada has an immediate opening for a 6-12 month internship as an Intern Researcher.

About the team:

The Computing Data Application Acceleration Lab aims to create a leading global data analytics platform organized into three specialized teams using innovative programming technologies. This team focuses on full-stack innovations, including software-hardware co-design and optimizing data efficiency at both the storage and runtime layers. This team also develops next-generation GPU architecture for gaming, cloud rendering, VR/AR, and Metaverse applications. One of the goals of this lab is to enhance algorithm performance and training efficiency across industries, fostering long-term competitiveness.

About the job:

Develop and optimize RL post-training pipelines for LLMs (e.g., GRPO, reward modeling).
Conduct experiments to improve model performance, reasoning, and alignment.
Build scalable training, evaluation, and data generation systems.
Collaborate with researchers and engineers on cutting-edge LLM projects.
Stay current with advancements in RL, LLMs, and post-training research.

The total target annual compensation (based on 2,080 hours per year) ranges from $58,000 to $104,000 depending on education, experience, and demonstrated expertise.

Requirements

About the ideal candidate:

Enrolled as a Master's or Ph.D. student in Computer Science, AI, or a related field.
Strong background in machine learning, reinforcement learning, and deep learning. Familiarity with Large Language Models, transformer architectures, and post-training methods.
Proficiency in Python, PyTorch, and LLM frameworks.
Hands-on experience with LLMs and RL training algorithms (e.g., GRPO) is an asset.
Familiarity with RL frameworks, such as VeRL.
Experience with open-source LLM frameworks such as Hugging Face, DeepSpeed, vLLM, or SGLang is an asset.
Knowledge of domain-specific languages used with AI accelerators.
Experience with distributed training frameworks, large-scale experimentation, or LLM infrastructure is an asset.
Strong problem-solving and communication skills.

Additional Information:

Huawei Canada is committed to a fair, inclusive, and accessible recruitment process. If you require accommodation during any stage of the hiring process, please let us know and we will work with you to meet your needs.

All applications for this position are reviewed directly by our hiring team; we do not use artificial intelligence tools to screen or select candidates.
