Senior Applied Scientist

Zillow

1d•Remote

About The Position

Zillow’s AI organization builds the intelligence that powers Zillow’s mission to make home a reality for more people - across buying, selling, renting, and financing. The Partner Communications AI team is at the forefront of this mission, transforming how real estate agents, loan officers, and customers connect. We go beyond simple conversational interfaces to build integrated, intelligent communication ecosystems. Using modern NLP and Generative AI, we personalize high-stakes messaging, automate communication workflows, and surface real-time insights across multi-channel communication streams and diverse data. We operate at the intersection of Large Language Models (LLMs), agentic AI, classical ML, and complex multi‑channel data, collaborating deeply with product, engineering, analytics, operations, and compliance to ensure our AI solutions are innovative, trustworthy, unbiased, and measurable.

As a Senior Applied Scientist, you will be a hands-on technical leader and innovator for our partner communications AI systems. You will design novel models, reasoning systems, and evaluation frameworks that facilitate and enhance how real estate professionals and customers connect and interact. Your work will power a wide range of communication experiences - from context-aware personalized message creation to the next generation of intelligent agentic AI assistants that streamline professional workflows.

You will own the science lifecycle end-to-end: breaking down ambiguous problems, exploring messy multi‑channel and CRM data, training and fine‑tuning models, and defining how we measure success, quality, safety, and impact for GenAI at Zillow.

Requirements

Proven impact. 5+ years of experience applying advanced ML/AI to real-world problems. You have a track record of leading the scientific lifecycle - from translating ambiguous business objectives into concrete metrics and modeling roadmaps to partnering with engineering to see those solutions through to launch.
Deep NLP & GenAI foundation. Strong understanding of modern NLP, agentic, and LLM architectures/approaches, with practical experience delivering conversational or language-based systems.
Evaluation-first mindset. You focus on measurement. You have experience designing and analyzing experiments (for example, A/B tests or similar online/offline experiments), plus model- or LLM-based evaluation, and tying scientific metrics to clear business or customer KPIs.
Robust engineering skills. Proficient in Python and modern ML stacks (e.g., PyTorch, Transformers). You believe in clean code, rigorous testing, and version control.
Effective communicator. You can translate complex modeling trade-offs into clear narratives for partners in product and engineering to build alignment.

Nice To Haves

Conversational AI experience. Experience building language systems for applications with specific conversational or structural constraints - where the AI must adhere to complex logic, business rules, or specific communication flows.
Agentic / RL / fine-tuning experience. Experience with agentic LLM frameworks or tool-using assistants. Exposure to reinforcement learning or fine-tuning (e.g., RLHF, RLAIF, task-specific SFT) to steer and improve assistant behavior over time.
Advanced education. Doctoral degree in a STEM field and/or publications in machine learning or AI-related conferences and journals.
Technical leadership. Demonstrated ability to mentor junior scientists and engineers and contribute to shared libraries, best practices, or the broader AI community.

Responsibilities

Drive the science of communication intelligence. Own the modeling strategy for intelligent tools that enhance partner workflows, including messaging, summarization, and workflow automation.
Drive R&D and innovation. Conduct applied research to design novel architectures and reasoning systems.
Define evaluation and safety for GenAI. Design robust offline and online evaluation, including experiments, hallucination/error detection, and guardrails, so that systems are helpful, honest, and aligned with business and policy constraints.
Master messy, multi‑channel data. Work with large, messy real‑world data (call transcripts, SMS, email, CRM events, funnel behavior, and more) using big data tools (e.g., Databricks, SQL, Spark) to explore, transform, and productionize features.
Mentor and lead through influence. Partner with product and engineering to turn research into business impact, while mentoring scientists, engineers, and PhD interns. You will elevate the scientific craft of the broader organization, with opportunities to drive novel research and external publications.