About The Position

Do you want to build the backbone of the Generative AI cloud at AWS? Do you want to build the future of the cloud for AI training and inference? Do you want to do industry-leading work delivering continuous price-performance improvements in the cloud for training multi-billion-parameter LLMs? Come join us in designing, delivering, and operating AWS cloud offerings that enable high performance and scalability for AI/ML and HPC workloads.

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we’re looking for talented people who want to help.

You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you’ll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.

Requirements

  • 3+ years of infrastructure architecture, database architecture and networking experience

Nice To Haves

  • Experience in computer architecture, or experience leading the architecture and design (architecture, design patterns, reliability and scaling) of new and current systems

Responsibilities

  • As a member of the Hardware Engineering Services team in this specific function, you will own and lead the design, development, and root-cause analysis of a new segment of accelerated servers.
  • You will work closely with our customers to understand their technical needs and business goals, leveraging your experience with server design and the knowledge of various teams to architect the solutions that we will deploy at scale.
  • To deliver your products, you will work with an interdisciplinary team of component, firmware, test, qualification, and integration engineers, and lead our design and manufacturing partners to bring these servers to the data center.
  • After launch, you will oversee the fleet of servers you develop, monitoring their quality and how well they meet customer requirements.
  • Your day-to-day responsibilities will include interfacing with our internal and external customers to understand project requirements and facilitate system development on top of your server design.
  • You will be responsible for learning about the operational challenges facing our existing fleet, with the goal of improving the current customer experience as well as developing improved systems for future designs.
  • You will work directly with vendors and ODM/JDM design teams to develop and manufacture your product at scale.

Benefits

  • The Amazon package will include sign-on payments and restricted stock units (RSUs).
  • Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave.