Jobs Near Me

HiringNearMe.work

Local Jobs, Zero Commute

Local Job Near You

Deep Learning Performance Architect, CUTLASS DSL

🏒
NVIDIA
πŸ“ Shanghai, China
πŸ“
Location Shanghai
πŸ“…
Posted June 10, 2026
πŸš—
Commute Local Area
🎯
Local Opportunity Near You!

This job is in your area. Enjoy a short commute and work close to home.

Job Description

Are you passionate about programming languages, compiler technology, and GPU performance? Do you want to help shape the future of high-performance kernel development for AI? We are looking for outstanding engineers to build CUTLASS DSL, a Python-native language for GPU kernel development, along with the MLIR dialects and lowering passes behind it. In this role, you will also help accelerate kernel compilation while delivering performance comparable to CUTLASS C++, enabling efficient hardware-software co-design for NVIDIA's next generation of AI platforms.



What you'll be doing:
+ Design, develop, and optimize CUTLASS DSL, a Python-native language for high-performance GPU kernel development
+ Build and advance the MLIR dialects, lowering passes, and code generation flows that power the CUTLASS DSL stack
+ Drive innovations that improve kernel compilation speed while maintaining performance on par with CUTLASS C++
+ Col...


Location Details

City: Shanghai
Country: China
Commute: Local Area
