Staff Software Engineer, ML Frameworks & Efficiency

Mountain View, California

Job Description

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver (The World's Most Experienced Driver™) to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo One, a fully autonomous ride-hailing service, and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over one million rider-only trips, enabled by its experience autonomously driving tens of millions of miles on public roads across 13+ U.S. states and tens of billions of miles in simulation.

The Waymo ML Frameworks & Efficiency team works with Research and Production teams to develop and deploy models in Perception and Planning that are core to our autonomous driving software. We help our partners by offering the best frameworks for the entire model development lifecycle and efficiency solutions for model execution. These frameworks and solutions are geared towards both scaling models and solving problems unique to ML for autonomous driving.

We are looking for engineers with ML frameworks or ML systems expertise to help us improve compute efficiency both in the cloud and on the car. You'll work across the entire ML stack, from deep learning model architectures and ML frameworks (e.g., JAX, XLA) to the accelerator runtime. You will work closely with ML modeling teams to drive large-scale, efficient model training and inference.

You Will:

Develop new neural model architectures (e.g., sparse architectures) and decoding strategies (e.g., speculative decoding) to improve training and inference performance on modern TPU and GPU architectures.
Improve the accelerator FLOPS efficiency of ML workloads, including improving compiler optimizations (e.g., XLA), authoring low-level kernels (e.g., Pallas, Triton), and enabling low-precision computation.
Optimize ML systems for high performance on TPU and GPU clusters, including reducing communication overhead and memory consumption and ensuring scalability and reliability across distributed environments.
Evaluate and integrate state-of-the-art technologies from the open source community and Google to enhance the performance and scalability of ML workloads.
Promote best practices for distributed systems architecture and contribute to technical leadership within the team.

You Have:

B.S. in Computer Science or Math, or 8+ years of equivalent real-world experience.
Proficiency in distributed systems design with an understanding of ML efficiency.
Experience with ML frameworks, including TensorFlow, JAX, and XLA.
Solid programming skills in Python and C++.
Practical familiarity with profiling tools to uncover performance bottlenecks.

We Prefer:

M.S. in Computer Science or Math.
Familiarity with kernel-authoring languages like Pallas and Triton.

The expected base salary range for this full-time position across US locations is listed below. Actual starting pay will be based on job-related factors, including exact work location, experience, relevant training and education, and skill level. During the hiring process, your recruiter can share more about the specific salary range for the role location or, if the role can be performed remotely, the specific salary range for your preferred location.

Waymo employees are also eligible to participate in Waymo’s discretionary annual bonus program, equity incentive plan, and generous Company benefits program, subject to eligibility requirements.

Salary Range

$238,000—$302,000 USD
