Software Engineer (GPU Infrastructure, High Performance Computing)
Cohere
Location
Toronto, Canada
Posted
June 27, 2026
Job Description
Requirements
- Deep expertise in ML/HPC infrastructure: Experience with GPU/TPU clusters, distributed training frameworks (JAX, PyTorch, TensorFlow), and high-performance computing (HPC) environments
- Kubernetes at scale: Proven ability to deploy, manage, and troubleshoot cloud-native Kubernetes clusters for AI workloads
- Strong programming skills: Proficiency in Python (for ML tooling) and Go (for systems engineering), with a preference for open-source contributions over reinventing solutions
- Low-level systems knowledge: Familiarity with Linux internals, RDMA networking, and performance optimization for ML workloads
- Research collaboration experience: A track record of working closely with AI researchers or ML engineers to solve infrastructure challenges
- Self-directed problem‑solving: The ability to identify bottlenecks, propose solutions, and drive impact in a fast‑paced environment
- If some of the above does...