Deep Learning Kernel Software Performance Architect
NVIDIA
Shanghai, China
Location
Shanghai
Posted
June 22, 2026
Job Description
NVIDIA is seeking Software Performance Architects to optimize GPU kernel performance for state-of-the-art data-center platforms. We build automated, data-driven workflows to detect, explain, and prevent performance regressions across key deep learning workloads, partnering closely with kernel developers, compiler teams, infrastructure, and architecture/performance groups.
What you'll be doing:
+ Performance analysis + debugging
+ Validate and analyze performance of GPU-accelerated kernels and key deep learning building blocks.
+ Debug performance issues end-to-end: reproduce, isolate root causes, propose fixes or mitigation paths, and drive closure with the owning teams.
+ Build performance narratives using structured evidence: baselines, controlled comparisons, and regression attribution.
+ Automation + regression infrastructure (Python-heavy)
+ Develop and maintain Python-based automation for performance testing and analysis, using modern AI-assisted ...