Sr. Staff Software Engineer, AI Infra

LinkedIn-posted 14 days ago

$180,000 - $300,000/Yr

Full-time • Senior

Mountain View, CA

Administrative and Support Services

LinkedIn is the world's largest professional network, built to help members of all backgrounds and experiences achieve more in their careers. Our vision is to create economic opportunity for every member of the global workforce. Every day our members use our products to make connections, discover opportunities, build skills and gain insights. We believe amazing things happen when we work together in an environment where everyone feels a true sense of belonging, and that what matters most in a candidate is having the skills needed to succeed. It inspires us to invest in our talent and support career growth. Join us to challenge yourself with work that matters.

At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.

Join us to push the boundaries of scaling large models together. The team is responsible for scaling LinkedIn's AI model training, feature engineering, and serving for models with hundreds of billions of parameters, along with large-scale feature engineering infrastructure for all AI use cases, from recommendation models and large language models to computer vision models. We optimize performance across algorithms, AI frameworks, data infra, compute software, and hardware to harness the power of our GPU fleet, which has thousands of the latest GPU cards. The team also works closely with the open source community and has many open source committers (TensorFlow, Horovod, Ray, vLLM, Hugging Face, DeepSpeed, etc.) on the team. Additionally, the team focuses on technologies such as LLMs, GNNs, incremental learning, online learning, and serving performance optimizations across billions of user queries.

Owning the technical strategy for broad or complex requirements with insightful and forward-looking approaches that go beyond the direct team and solve large open-ended problems.
Designing, implementing, and optimizing the performance of large-scale distributed serving or training for personalized recommendation as well as large language models.
Improving the observability and understandability of various systems with a focus on improving developer productivity and system sustenance.
Mentoring other engineers, defining our challenging technical culture, and helping to build a fast-growing team.
Working closely with the open-source community to participate in and influence cutting-edge open-source projects (e.g., vLLM, PyTorch, GNNs, DeepSpeed, Hugging Face).
Functioning as the tech lead for several concurrent key initiatives in AI Infrastructure and defining the future of AI Platforms.

BS/BA in Computer Science or related technical field or equivalent technical experience
5+ years of industry experience in software design, development, and algorithm related solutions
5+ years of experience programming in object-oriented languages such as Python, C++, Java, Go, Rust, Scala
2+ years of experience as an architect or in a technical leadership position
5+ years of experience in the industry with leading / building deep learning systems
Hands-on experience developing distributed systems or other large-scale systems

MS or PhD in Computer Science or related technical discipline.
10+ years of experience in software design, development, and algorithm related solutions with at least 5 years of experience in a technical leadership position
10+ years of experience in an object-oriented programming language such as Python, C++, Java, Go, Rust, Scala
5+ years of experience with large-scale distributed systems and client-server architectures
Experience building ML applications, LLM serving, GPU serving.
Co-author or maintainer of any open-source projects
Expertise in machine learning infrastructure, including technologies like MLflow, Kubeflow, and large-scale distributed systems
Expertise in deep learning frameworks and tensor libraries like PyTorch, TensorFlow, JAX/Flax

The pay range for this role is $180,000 to $300,000.
Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location.
The total compensation package for this position may also include annual performance bonus, stock, benefits and/or other applicable incentive compensation plans.
