Senior Engineering Manager, ML Platform

Zoox
Foster City, CA
$317,000 - $370,000

About The Position

Zoox is on a mission to reimagine transportation and build autonomous robotaxis from the ground up that are safe, reliable, clean, and enjoyable for everyone. With bidirectional driving capabilities and four-wheel steering, our vehicle allows us to maneuver through compact spaces and change directions without needing to reverse. We are still in the early stages of deploying our robotaxis, and it is a great time to join Zoox and have a significant impact on executing this mission.

Our growing Software Infrastructure engineering leadership team is looking for a Senior Engineering Manager, ML Platform. The centralized ML Platform team at Zoox plays a crucial role in enabling innovations across all our Autonomy and Data Science teams to develop and deploy models across our robotaxi and cloud infrastructure, and to work on cutting-edge training and inference optimization techniques. We are working on many interesting challenges to enable rapid experimentation and scale our multi-modal Foundation models and RL infrastructure, and ensure these models run efficiently on our vehicles, meeting our latency targets.

You will get to work across all ML teams within Zoox - Perception, Prediction, Planner, Simulation, Collision Avoidance, and our Advanced Hardware Engineering group, and have the opportunity to significantly push the boundaries of how ML is practiced within Zoox. We build and operate the base layer of ML tools, deep learning frameworks, and inference libraries used by our applied research teams for in- and off-vehicle ML use cases. You will lead a team of strong software engineers and managers and act as a force multiplier for our internal customers. This team has many growth opportunities as we expand our robotaxi deployments and venture into new ML domains. If you want to learn more about our ML Infrastructure, here is one of our past talks at re:Invent.

Requirements

  • 10+ years of relevant experience, including 4+ years managing other managers and engineers.
  • Experience building user-friendly ML Infrastructure that enabled large-scale model training and high-throughput, low-latency serving use cases.
  • Experience with training frameworks like PyTorch, JAX, etc., leveraging GPUs for distributed model training.
  • Experience with GPU-accelerated inference using TensorRT, Ray Serve, or similar frameworks.

Responsibilities

  • Vision: Develop and execute a strategic vision for our ML training platform, ensuring scalability, reliability, and performance to support large-scale Foundation and RL models.
  • Technical acumen: Lead the design, implementation, and operation of a robust and efficient ML training platform to enable the training, experimentation, validation, and monitoring of ML models.
  • Hiring: Attract, hire, and inspire a diverse world-class engineering team, fostering a culture of innovation, collaboration, and excellence.
  • Partnership: Collaborate closely with cross-functional teams, including ML researchers, software engineers, data engineers, and hardware engineers to define requirements and align on architectural decisions.
  • Mentorship: Enable the engineers in the team to grow their careers by providing the right opportunities along with clear and timely feedback.

Benefits

  • paid time off (e.g. sick leave, vacation, bereavement)
  • unpaid time off
  • Zoox Stock Appreciation Rights
  • Amazon RSUs
  • health insurance
  • long-term care insurance
  • long-term and short-term disability insurance
  • life insurance