
HiringNearMe.work

Solutions Architect, Inference Deployments

NVIDIA
Location: Santa Clara, United States
Posted: June 02, 2026

Job Description

We’re forming a team of innovators to roll out and enhance AI inference solutions at scale that demonstrate NVIDIA’s GPU technology and Kubernetes. As a Solutions Architect focused on inference, you’ll collaborate closely with our engineering and DevOps teams and with customers to develop enterprise AI solutions. Together, we’ll deliver generative AI to production!


What you'll be doing:
+ Build inference pipelines with tools like NVIDIA Dynamo, distributing tasks among GPU workers to improve efficiency.
+ Collaborate with DevOps teams to orchestrate disaggregated inference using Kubernetes for complex workloads.
+ Accelerate inference pipelines using TensorRT-LLM, vLLM, SGLang, and other backends to ensure seamless integration with disaggregated inference.
+ Provide mentorship and technical leadership to customers and internal teams, guiding them through the deployment of disaggregated inference systems and resolving complex issues.



What we need to see:...


