Senior Machine Learning Engineer - Computer Vision

Warner Bros. Discovery, San Francisco, CA

About The Position

At Warner Bros. Discovery, we are reimagining how machine learning transforms storytelling. Within the AI/ML organization, which focuses on applying AI to video, the Machine Learning Engineer – Services group powers the infrastructure and backend services behind production workflows. We're looking for an experienced ML Engineer with strong fundamentals and infrastructure experience to help build reusable components and services for video understanding, video summarization, and video classification. You will be part of a team focused on model retraining, model hosting, cost optimization, and managing production workflows at scale.

Requirements

  • 5+ years of experience in machine learning engineering, with end-to-end ML workflow expertise
  • Strong background in model retraining, fine-tuning, and evaluation techniques
  • Experience deploying and managing open-source model servers (e.g., Triton, TorchServe, Ray Serve)
  • Proficient in managing cost-effective distributed computing environments (e.g., Kubernetes, Ray, SageMaker)
  • Familiar with experiment tracking tools (e.g., MLflow, Weights & Biases) and model versioning strategies
  • Deep understanding of ML domains including NLP, RecSys, and reinforcement learning
  • Familiarity with labeling tools, HITL workflows, and offline data curation strategies
  • Comfort working in Agile development environments and collaborating across global teams

Nice To Haves

  • Experience with real-time inference systems and streaming data pipelines

Responsibilities

  • Build and maintain pipelines for model fine-tuning and retraining, including LoRA-based workflows and Large Language Models
  • Integrate and maintain vector search services and semantic similarity infrastructure
  • Design scalable model serving solutions for open-source and foundation models
  • Develop systems for experiment tracking, model versioning, and evaluation
  • Monitor production models for drift and performance degradation
  • Manage compute cost and resource optimization across distributed training jobs
  • Integrate Human-in-the-Loop (HITL) workflows and offline labeling into training pipelines
  • Support model deployment for varied model architectures, including Vision-Language Models, Convolutional Neural Nets, and Embedding Generation models
  • Stand up and maintain Feature Store and data versioning infrastructure
  • Architect and implement RAG pipelines for video metadata, summarization, and Q&A
  • Build evaluation frameworks to assess LLM performance, hallucination frequency, and structured response accuracy

Benefits

  • Health insurance coverage
  • Employee wellness program
  • Life and disability insurance
  • Retirement savings plan
  • Paid holidays, sick time, and vacation