Senior Performance Engineer - LLM Inference Frameworks

🏢 NVIDIA

📍 Yokneam, Israel

📅 Posted June 22, 2026

📋 Job Description

NVIDIA is hiring exceptional software engineers to build and optimize the core inference infrastructure for large language models. Join the TensorRT‑LLM team, the group defining how generative AI performs at global scale on NVIDIA GPUs. We’re looking for engineers who love squeezing every drop of throughput, memory efficiency, and scalability out of modern model runtimes. Your work will directly shape the frameworks behind state‑of‑the‑art LLM inference used across NVIDIA and the AI community. Join us to redefine what “fast” means for LLM inference by building the frameworks that power the next generation of generative AI at scale.
  
What you'll be doing:
+ Design, implement, and optimize high‑performance inference pipelines for large language models running on GPUs
+ Profile and tune model execution across the stack, from scheduler design to kernel fusions and everything in between
+ Design and experiment with memory management strategies for improved memory ba...
                
