Senior Software Engineer, Machine Learning, Core ML

Google · Mountain View, CA
$166,000 - $244,000

About The Position

We are the RecML team in Core ML's Applied ML organization. Our mission is to accelerate product innovation through ML for recommendations and user modeling. We engage deeply with Alphabet product areas, partnering with them through applied research in recommendations and user modeling, and we generalize successful innovations into standardized, maintainable, production-grade solutions for use by other teams and products. This is a horizontal ML infrastructure and efficiency role supporting the training framework of our foundation recommender model and its customers.

The US base salary range for this full-time position is $166,000-$244,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in US role postings reflect the base salary only and do not include bonus, equity, or benefits. Learn more about benefits at Google [https://careers.google.com/benefits/].

Requirements

  • Bachelor’s degree or equivalent practical experience.
  • 5 years of experience programming in Python or C++.
  • 3 years of experience with ML infrastructure (e.g., model deployment, model evaluation, optimization, data processing, debugging).

Nice To Haves

  • Master’s degree or PhD in Computer Science, Machine Learning, Computer Engineering, or a related technical field.
  • Experience scaling machine learning models (e.g., Large Language Models (LLMs) or foundation models), including managing the complexities of transitioning architectures from data-parallel to model-, tensor-, or pipeline-parallel configurations.
  • Experience with deep learning frameworks (e.g., JAX, PyTorch, or TensorFlow), including a track record of contributing to or modifying their core internals to support novel and emerging use cases.
  • Experience with co-designing hardware-aware optimizations to accelerate model execution.
  • Knowledge of machine learning compilers (e.g., Accelerated Linear Algebra (XLA) or Multi-Level Intermediate Representation (MLIR)).

Responsibilities

  • Architect and implement the transition from data-parallel to model-parallel training paradigms.
  • Design and manage large-scale training runs across multi-pod environments, maximizing data center network bandwidth and minimizing communication bottlenecks.
  • Research and integrate transformer model optimizations and novel architectural variants to reduce training time and resource consumption.
  • Write and optimize low-level model code, including custom Pallas kernels, to extract maximum performance from the hardware.
  • Work cross-functionally with partner teams and the kernel optimization team to co-design and implement compiler-level optimizations that accelerate model execution.