Senior Applied ML Researcher - Video Apps
Apple
**Location:** Cupertino, United States
**Posted:** June 10, 2026
Job Description
**Weekly Hours:** 40
**Role Number:** 200640997-0836
**Summary**
We are seeking a Senior Applied ML Researcher to design, train, and deploy state-of-the-art models for visual and audio understanding. You will work on challenging problems at the intersection of computer vision, audio signal processing, and multimodal learning, enabling intelligent systems that can see, hear, and reason about the world.
You will collaborate closely with research scientists, engineers, and product teams to find novel applications of Deep Machine Learning capabilities to assist our creative user base. Your mission is to elevate the workflows of millions of creators by combining generative AI with Apple's human-centered design principles.
**Description**
Design and train deep neural networks for video, image, audio, and audio-visual tasks.
Build models for audio-visual representation learning, cross-modal alignment, and fusion.
Develop solutions for tasks such as:
Vi...