We are looking for a full-stack engineer with experience in observability, tooling, and data pipelines to capture, aggregate, and surface production signals about human-model interactions. You will build systems to understand and act on (1) how users engage with our models and (2) where our models fall short, and you will develop robust evaluations to define and track improvements in model behavior. These signals power our understanding of real-world model performance and can be captured in creative ways beyond standard logging and metrics, which requires both technical skill and product intuition. This role requires working across the stack, from building front-ends that surface and visualize insights to debugging back-end pipelines, and collaborating with cross-functional teams to ship iteratively under tight deadlines. You should thrive in scrappy environments, prototype solutions quickly, and care deeply about end-user experience and aesthetics.