Novel Testing is a team within Trust & Safety specializing in complex testing, defining protocols and methodologies for assessing risk where best practices do not yet exist. We pioneer and scale innovative testing programs, streamlining the launch of novel, trustworthy, and responsible AI (RAI) products. Our work spans from designing first-of-their-kind evaluations for Google’s most ambitious product bets (autonomous agents, personalization, and the latest hardware) to developing new methodologies for assessing novel foundation model capabilities as they emerge. Advancing the state of the art in AI evaluation is central to this mission. To scale these methods, we partner closely with engineering teams to build the infrastructure and tools required for automated, rigorous evaluation.

You will lead the development of novel testing methodologies for emergent AI, which demands the methodological precision to design evaluation frameworks where established standards do not yet exist. You’ll tackle complex data science questions through creative experimentation, designing sophisticated prompt strategies and quantitative analyses to identify systemic risks and edge cases in GenAI products. Bridging the gap between theory and execution, you will build and prototype testing solutions grounded in data science best practices, then partner directly with engineering teams to inform the development of automated infrastructure, ensuring your insights scale across Google’s ecosystem. You will bring a researcher’s mindset, capable of deep qualitative and quantitative inquiry, paired with the technical agility to translate those findings into scalable, high-impact engineering prototypes.
Job Type: Full-time
Career Level: Mid Level