AMD is looking for a performance-obsessed engineer to drive AI inference performance to the absolute limit on AMD GPUs. You will lead a small, highly technical team and work end-to-end across the stack: profiling, diagnosing, and optimizing leading models on customer-relevant serving configurations (e.g. agentic coding, long-context, high-throughput serving). You move from challenge to challenge, tackling the hardest performance problems across our most strategic customer engagements and leaving behind measurable uplifts and reusable methodology. This is not a sustaining role: every engagement is different, every optimization leaves a lasting impact.

THE PERSON:

You can take any AI workload, understand it top to bottom, and make it faster. You are equally comfortable profiling a distributed serving deployment, diagnosing a kernel-level bottleneck, and presenting optimization results to a customer's VP of Engineering. You understand GPU kernel performance deeply: not just how to use profiling tools, but how to reason about occupancy, cache behavior, memory coalescing, and instruction-level bottlenecks from first principles.

You lead through technical depth: you set the standard for your team by doing the hardest work yourself and pulling others up along the way. You are AI-fluent, not just in the models you optimize, but in how you work: you leverage AI agents and tools daily to accelerate your workflows, and you actively define new ways of using them to make yourself and your team more effective. You thrive under pressure, move fast, and measure everything.
Job Type
Full-time
Career Level
Principal