Technical Lead Manager, Model Quality - Claude Code

Anthropic · San Francisco, NY · Hybrid

About The Position

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

We're looking for a Technical Lead Manager to build and lead the Model Quality engineering team within Claude Code. This team sits at the intersection of engineering and research, building the eval systems, data pipelines, and experimentation infrastructure that tell us where Claude's coding capabilities excel and where they fall short, and then closing those gaps.

As TLM, you'll be hands-on, setting technical direction, reviewing designs, and shipping code alongside your team, while also hiring, coaching, and growing a group of strong senior engineers who thrive in ambiguous, high-intensity environments. You'll be the connective tissue between Claude Code product priorities and Anthropic's research org, ensuring the team is building infrastructure that actually accelerates our research loop.

Requirements

  • Have led engineering teams (as a manager or tech lead) building complex infrastructure — data platforms, ML tooling, eval systems, or research computing
  • Are a strong IC engineer in your own right and want to stay technical
  • Have operated in high-intensity, fast-iteration environments and know how to keep a team moving without burning out
  • Are comfortable navigating ambiguity across organizational boundaries — you know how to align teams with different incentives on shared goals
  • Are a power user of agentic coding tools and have real intuition for where models are strong and where they break
  • Care deeply about correctness and reliability, and can instill that bar in a team
  • Have 8+ years of engineering experience, including 2+ leading teams
  • Hold at least a Bachelor's degree in a related field, or have equivalent experience

Nice To Haves

  • Built or maintained evaluation frameworks for ML systems
  • Experience with reinforcement learning infrastructure
  • A background in research computing, scientific infrastructure, or ranking and recommendation systems
  • Experience with production ML monitoring and observability
  • A strong quantitative foundation

Responsibilities

  • You'll own the technical roadmap for model quality infrastructure on Claude Code, including eval frameworks, experimentation tooling, and data pipelines.
  • You'll be accountable for the reliability and correctness of systems that researchers depend on daily.
  • You'll hire and support a team of engineers, partner closely with research leadership to translate open questions into engineering priorities, and work with Claude Code product to ensure capability improvements show up in the product.
  • And you'll stay close to the code!

Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • A lovely office space in which to collaborate with colleagues