On-prem Platform Engineer
Apolis
Charlotte, North Carolina, United States
Location
Charlotte, North Carolina
Posted
May 16, 2026
Job Description
On-prem Platform Engineer
Location: Charlotte, NC
Key Skills:
Must-Have Skills (Mandatory Keywords)
LLM Inference & Optimization
- vLLM, TensorRT-LLM, Triton Inference Server, SGLang
- Inference optimization techniques:
  - Continuous batching
  - Speculative decoding
  - KV cache / Prefix caching
- Model optimization:
  - FP8, AWQ, GPTQ
Distributed & GPU Systems
- Tensor parallelism and large model scaling
- CUDA, NCCL, GPU architecture
- GPU partitioning & optimization (MIG)
Kubernetes & ML Serving
- Kubernetes-based ML serving...