Applied Scientist

Amazon•Seattle, WA

About The Position

At Amazon Selection and Catalog Systems (ASCS), our mission is to power the online buying experience for customers worldwide so they can find, discover, and buy any product they want. We innovate on behalf of our customers to ensure uniqueness and consistency of product identity and to infer relationships between products in Amazon Catalog to drive the selection gateway for the search and browse experiences on the website. We're solving a fundamental AI challenge: establishing product identity and relationships at unprecedented scale. Using Generative AI, Visual Language Models (VLMs), and multimodal reasoning, we determine what makes each product unique and how products relate to one another across Amazon's catalog. The scale is staggering: billions of products, petabytes of multimodal data, millions of sellers, dozens of languages, and infinite product diversity—from electronics to groceries to digital content. The research challenges are immense. GenAI and VLMs hold transformative promise for catalog understanding, but we operate where traditional methods fail: ambiguous problem spaces, incomplete and noisy data, inherent uncertainty, reasoning across both images and textual data, and explaining decisions at scale. Establishing product identities and groupings requires sophisticated models that reason across text, images, and structured data—while maintaining accuracy and trust for high-stakes business decisions affecting millions of customers daily. Amazon's Item and Relationship Platform group is looking for an innovative and customer-focused applied scientist to help us make the world's best product catalog even better. In this role, you will partner with technology and business leaders to build new state-of-the-art algorithms, models, and services to infer product-to-product relationships that matter to our customers. You will pioneer advanced GenAI solutions that power next-generation agentic shopping experiences, working in a collaborative environment where you can experiment with massive data from the world's largest product catalog, tackle problems at the frontier of AI research, rapidly implement and deploy your algorithmic ideas at scale, across millions of customers.

Requirements

PhD, or Master's degree and 4+ years of CS, CE, ML or related field experience
Experience programming in Java, C++, Python or related language
2+ years of building machine learning models or developing algorithms for business application experience

Nice To Haves

Experience with LLMs, VLMs, foundation models, or large-scale deep learning systems—including multimodal pretraining, fine-tuning, RLHF, prompt engineering, or agentic architectures
Experience with LLM/VLM serving optimization, including model distillation, quantization, pruning, speculative decoding, or other model compression and efficient inference techniques
Experience with explainable AI, model interpretability, or uncertainty quantification
Strong experimental design skills and statistical analysis expertise
Track record of deploying ML models at scale in production environments processing billions of data points
Publications in top-tier venues such as NeurIPS, ICML, ICLR, CVPR, ICCV, EMNLP, ACL, NAACL, COLING, KDD, SIGMOD, WWW, AAAI, or similar
Excellent written, verbal communication & data presentation skills

Responsibilities

Formulate open research problems at the intersection of GenAI, multimodal reasoning, and large-scale information retrieval—defining the scientific questions that transform ambiguous, real-world catalog challenges into publishable, high-impact research
Push the boundaries of VLMs, foundation models, and agentic architectures by designing novel approaches to product identity, relationship inference, and catalog understanding—where the problem complexity (billions of products, multimodal signals, inherent ambiguity) demands methods that don't yet exist
Advance the science of efficient model deployment—developing distillation, compression, and LLM/VLM serving optimization strategies that preserve frontier-level multimodal reasoning in compact, production-grade architectures while dramatically reducing latency, cost, and infrastructure footprint at billion-product scale
Make frontier models reliable—advancing uncertainty calibration, confidence estimation, and interpretability methods so that frontier-scale GenAI systems can be trusted for autonomous catalog decisions impacting millions of customers daily
Own the full research lifecycle from problem formulation through production deployment—designing rigorous experiments over petabytes of multimodal data, iterating on ideas rapidly, and seeing your research directly improve the shopping experience for hundreds of millions of customers
Shape the team's research vision by defining technical roadmaps that balance foundational scientific inquiry with measurable product impact
Mentor scientists and engineers on advanced ML techniques, experimental design, and scientific rigor—building deep organizational capability in GenAI and multimodal AI
Represent the team in the broader science community—publishing findings, delivering tech talks, and staying at the forefront of GenAI, VLM, and agentic system research

Benefits

health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
401(k) matching
paid time off
parental leave

Stand Out From the Crowd

Upload your resume and get instant feedback on how well it matches this job.

Upload and Match Resume