Principal High-Performance LLM Training Engineer
NVIDIA
Santa Clara, United States
Location
Santa Clara
Posted
June 06, 2026
Job Description
NVIDIA is seeking a Principal Engineer to drive the performance of large-scale AI training and post-training workloads across NVIDIA's full hardware and software stack. This role sits at the intersection of distributed training, GPU architecture, systems software, deep learning frameworks, and performance engineering. You will analyze and optimize frontier-scale LLM workloads running on thousands of GPUs, drive improvements across frameworks such as PyTorch, JAX, NeMo, and NeMo RL, and use insights from real workloads to help shape future NVIDIA GPU, system, and software roadmaps.
We are looking for a deeply technical leader who can operate across abstraction layers: from application-level training behavior to framework/runtime internals, CUDA libraries, communication collectives, memory systems, networking, and GPU architecture. At this level, success means both directly improving performance and setting technical direction, raising the bar for the org...