Amazon
Posted 16 days ago
Santa Clara, CA

About the position

AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.

Amazon AI is looking for a software engineer to join the HyperPod Engines team, which is building a resilient platform for deep learning training. Amazon SageMaker HyperPod scales and accelerates generative AI model development across thousands of AI accelerators. As part of the HyperPod Engines team, you will develop training frameworks and communication libraries, working with training frameworks such as PyTorch, NeMo, and Megatron and with collective communication libraries such as NCCL. You will develop software to train and fine-tune large language models such as Llama.

You will work in a fast-paced, cross-disciplinary team of engineers and researchers who are leaders in the field. You will take on challenging problems, distill real requirements, and then deliver solutions that either leverage existing academic and industrial research or apply your own out-of-the-box, pragmatic thinking. In addition to coming up with novel solutions and prototypes, you will deliver these to production in customer-facing products.

Responsibilities

  • Creating and modifying a large or significant set of components, a mid-size application, or a service.
  • Developing model training optimizations such as context parallelism, pipeline parallelism, and tensor parallelism.

Benefits

  • Work-life harmony
  • Mentorship & Career Growth
  • Inclusive Team Culture
  • Diverse Experiences