Research & Development Intern-LLM, Summer 2026

The Walt Disney Company, Glendale, CA
15h | $47 - $47

About The Position

About our Program: Magic is Within You

Walt Disney Imagineering (WDI) is the master planning, creative development, design, engineering, production, project management, and research arm of The Walt Disney Company’s (TWDC) Parks and Resorts business segment. Representing more than 150 disciplines, the talented corps of Imagineers is responsible for the creation of Disney resorts, theme parks and attractions, hotels, water parks, real estate developments, regional entertainment venues, cruise ships, and new media technology projects.

As part of The Walt Disney Company, Disney Research builds upon a rich legacy of innovation and technology leadership in the entertainment industry that continues to this day. One area of work focuses on building generative AI-powered tools that make Imagineers' work more efficient and impactful. Our current project investigates the capabilities of large language models and agent workflows. We’re seeking a lab associate who can help explore how to evaluate such systems for Imagineering teams. We’re particularly interested in benchmarks and evaluation metrics for conversational agents in which the success of the agent-user interaction is based on a subjective judgement.

Our work environment is research-focused and prototype-driven: we are constantly looking for the thing we can test out that will give us the most insight into our biggest unknowns.

A Research and Development Internship generally runs three to four months, although exceptions are made when necessary.

Requirements

  • Experience with LLMs
  • Experience with agentic systems
  • Experience with LLM evaluation metrics
  • Python coding skills
  • Previous experience with experiment design and data analysis

Nice To Haves

  • Experience with PyTorch

Responsibilities

  • Help design and develop custom evaluation metrics for LLM and agentic systems
  • Verify those new metrics on existing datasets
  • Assist in design and setup of experimental data collection efforts for new datasets
  • Validate the custom metrics against human perception, using existing datasets or designing playtests to collect those perceptions
  • Document and develop software for running the custom eval metric(s) in larger systems
  • Understand and engage with the broad technical and creative goals of the project
  • Help define milestones that solve important technical issues and convey the potential impact of the project
  • Engage in team brainstorms and troubleshooting