Senior Software Engineer, AI Inference Systems

🏢 NVIDIA

📍 Toronto, Canada

📅 Posted June 03, 2026

📋 Job Description

We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You’ll architect and implement high-performance inference stacks, optimize GPU kernels and compilers, drive industry benchmarks, and scale workloads across multi-GPU, multi-node, and multi-cloud environments. You’ll collaborate across inference, compiler, scheduling, and performance teams to push the frontier of accelerated computing for AI.
  
What you’ll be doing:
+ Contribute features to vLLM that let the newest models take advantage of the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) using methods such as speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
+ Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization; build and extend high...
