
HiringNearMe.work

Senior Software Engineer – TensorRT Edge-LLM

Company: NVIDIA
Location: Santa Clara, United States
Posted: June 03, 2026

Job Description

Are you passionate about pushing the limits of real-time large language model inference? Join NVIDIA’s TensorRT Edge-LLM team and help shape the next generation of edge AI for automotive and robotics. We build the software stack that enables Large Language, Vision-Language, and Multimodal (LLM/VLM/VLA) models to run efficiently on embedded and edge platforms — delivering cutting-edge generative AI experiences directly on-device.


What you’ll be doing:
+ Develop and evolve a state-of-the-art inference framework in modern C++ that extends TensorRT with autoregressive model serving capabilities, including speculative decoding, LoRA, MoE, and KV cache management.
+ Design and implement compiler and runtime optimizations tailored for transformer-based models running on constrained, real-time platforms.
+ Collaborate with teams across CUDA, kernel libraries, compilers, and robotics to deliver high-performance, production-ready solutions.
+ Contribute to CUDA kernel...
