Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

NVIDIA
Location: Santa Clara, United States
Posted: June 03, 2026

Job Description

NVIDIA Dynamo is a high-throughput, low-latency inference framework for serving generative AI and reasoning models across multi-node distributed environments. Built in Rust for performance and Python for extensibility, Dynamo orchestrates GPU shards, routes requests, and manages shared KV cache across heterogeneous clusters so that many accelerators feel like a single system at datacenter scale. As large language models rapidly outgrow the memory and compute budget of any single GPU, this platform enables efficient, resilient deployment of cutting-edge LLM workloads.

We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale LLM and storage systems.

What you'll be doing:
+ Design and evolve a unified memory layer that spans GPU memory, pinned host memory, RDMA-accessible memory, SSD tiers, and remote file/object/cloud storage to support large-scale LLM inference.
+ Architect and implement deep integrations w...
