OpenAI • Posted 15 days ago
Full-time • Mid Level
Hybrid • San Francisco, CA
Publishing Industries

About the position

We are looking for a full-stack engineer with experience in observability, tooling, and data pipelines to capture, aggregate, and surface production signals about human-model interactions. You will build systems to understand and act on (1) how users engage with our models and (2) where our models fall short, and you will develop robust evaluations to define and track improvements in model behavior. These signals power our understanding of real-world model performance. They can be captured in creative ways beyond standard logging and metrics, which requires both technical skill and product intuition. This role requires working across the stack, from building front ends that surface and visualize insights to debugging back-end pipelines, and collaborating with cross-functional teams to ship iteratively under tight deadlines. You should thrive in scrappy environments, quickly prototype solutions, and care deeply about end-user experience and aesthetics.

Responsibilities

  • Do the highest-leverage work to improve models for users at scale
  • Build systems to understand and act on how users engage with our models and where our models fall short
  • Develop robust evaluations to define and track improvements in model behavior
  • Rapidly prototype and develop tooling, dashboards, and visualizations for researchers and applied teams
  • Design, implement, test, and debug code across the research and product stack
  • Build and maintain robust telemetry, logging, and data pipelines to support production-scale model evaluation
  • Collaborate across research, safety, infrastructure, and product teams to deliver solutions that improve model efficiency and user experience
  • Own and support experiments that validate hypotheses around model behavior

Requirements

  • Experience building and maintaining full-stack observability tooling
  • Experience building evaluations for capability and model improvements
  • Ability to ship quickly under competing priorities and tight deadlines
  • Understanding of how evaluations work and curiosity about model training and iteration
  • Care about product polish, usability, and interface aesthetics
  • Team player who collaborates effectively across teams and is willing to take on a variety of tasks that move the team's work forward

Nice-to-haves

  • Experience owning 0-1 user-facing products or tools, ideally in a startup or fast-moving environment
  • Understanding of AI/ML workloads and experience building evaluation systems for them

Benefits

  • Relocation assistance for new employees
  • Hybrid work model of three days in the office per week