Principal Software Development Engineer – AI Frameworks & GPU Performance

Advanced Micro Devices, Inc. • Santa Clara, CA

2d • Hybrid

About The Position

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

Are you ready to shape the future of AI software on the world's fastest supercomputers and most advanced data centers? At AMD, we are seeking passionate and talented Software Engineers to join our team. In this role, you will be at the forefront of developing cutting-edge technology that enhances performance and efficiency for the next generation of GPU accelerators. This is your chance to contribute to open-source AI software from AMD and the broader community, driving innovation and optimizing AI performance across data center GPUs.

As part of our team, you will have a unique opportunity to work with industry-leading clients, leveraging the latest hardware capabilities for AI workloads. You will be among the first to integrate new hardware with the latest applications, libraries, frameworks, and SDKs, solving complex challenges and pushing the boundaries of AI technology. Join us in our mission to enable and optimize the software ecosystem for leading data centers and supercomputers, and be a key player in advancing the AI landscape.

You are a skilled engineer with a deep passion for pushing the boundaries of AI technology. You excel in open-source environments and relish the opportunity to tackle complex technical challenges. Your commitment to writing efficient, maintainable, and scalable software is matched by your collaborative spirit and curiosity. You are eager to contribute to open-source repositories that drive the next generation of AI workloads, and you take pride in being part of a team that is at the forefront of AI innovation.

Requirements

Strong proficiency in C++ and GPU programming languages such as CUDA, HIP, or OpenCL.
Experience with AI training and inference frameworks and tools.
Proven track record of developing high-performance, scalable software solutions.

Nice To Haves

Familiarity with the ROCm platform and data science library stack is a plus.
Excellent problem-solving skills and the ability to work effectively in a team environment.

Responsibilities

Design, develop, and optimize libraries for AMD data center GPUs, focusing on the ROCm data science library stack.
Collaborate with cross-functional teams to integrate and validate new features and enhancements.
Conduct performance analysis and optimization to ensure high efficiency and scalability of developed solutions.
Stay updated with the latest advancements in AI, machine learning, and GPU programming to drive innovation within the team.
Mentor junior engineers and contribute to the continuous improvement of development processes and best practices.