AI Infrastructure & Experience Engineer
FocusKPI Inc.
Mountain View, United States
Posted
June 08, 2026
Job Description
FocusKPI is seeking an AI Infrastructure & Experience Engineer to join one of our clients, a high-tech SaaS company.
Work Location: Mountain View, CA (Onsite role, 5 days/week onsite)
Duration: 4-month contract
Pay Range: $70-$79/hr
**No C2C resumes are considered**
Position Responsibilities:
+ Inference Optimization: Deploy and tune multiple LLMs and generative multimodal models on local inference hardware. Optimize performance metrics (TTFT, tokens/sec) via model quantization, caching strategies, and architecture-specific adjustments.
+ Systems Engineering & CUDA: Leverage deep knowledge of the CUDA environment to build custom kernels that maximize utilization of low-cost GPU compute.
+ Orchestration & Integration: Seamlessly bridge inference backends with orchestration layers (LiteLLM, Ollama, etc.) and frontends like OpenWebUI.
+ Rapid Prototyping: Build functional, high-fidelity demos showcasing m...