About The Position

Meta is seeking a Research Engineering Manager to lead the Post-Training team within Meta Superintelligence Labs (MSL). High-quality data is the core of AI progress at MSL, fueling our frontier capabilities and determining how our models interact with the world. In this leadership role, you will guide a team of research engineers who build the full-stack infrastructure to collect, generate, and refine data for our most advanced AI models. You'll partner with world-class researchers and engineers to define the strategic vision for our data engine, ensuring your team delivers high-quality, scalable pipelines for both human-in-the-loop and synthetic data.

This is a technical leadership role requiring research engineering expertise, people management skills, and the ability to drive execution on open-ended machine learning challenges. The data your team curates will directly impact the major model lines within MSL, making engineering reliability, data quality, and scalability paramount. You will excel by maintaining high velocity across your team while guiding them through a wide variety of challenges, from scaling expert data collection in domains like STEM, finance, legal, and health, to building novel environments for capturing advanced agentic tasks (search, deep research, computer use, coding, and UI generation).

If you are passionate about defining the data strategies that drive AI progress, have a track record of building high-performing technical teams, and thrive in fast-paced research environments, we encourage you to apply for this leadership opportunity at the core of MSL.

Requirements

  • Bachelor's or Master's degree in Computer Science, Machine Learning, or a related technical field
  • 4+ years of experience in machine learning engineering, machine learning research, or a related technical role
  • 3+ years of experience managing or leading technical teams, including hiring, mentoring, and performance management
  • Proficiency in Python and experience with ML frameworks such as PyTorch
  • Proven track record of leading medium to large-scale technical projects (specifically data pipelines or ML infrastructure) from conception to deployment
  • Experience with software engineering practices, including version control, testing, code review, and system design
  • Demonstrated ability to balance hands-on technical work with people management and strategic planning
  • Strong communication skills and the ability to influence cross-functional stakeholders

Nice To Haves

  • Publications at peer-reviewed venues (NeurIPS, ICML, ICLR, ACL, EMNLP, or similar) related to deep learning, language models, or data-centric AI
  • Hands-on experience managing teams that build language model post-training pipelines (SFT/RLHF) or synthetic data generation engines
  • Experience building infrastructure for agentic workflows, tool-use data collection, or reinforcement learning environments
  • Experience building and scaling large-scale distributed systems and high-throughput data processing pipelines
  • Familiarity with data quality filtering, deduplication, and vendor management for ML data
  • Experience managing teams in fast-paced research or startup environments
  • PhD in Computer Science, Machine Learning, or related field

Responsibilities

  • Build, mentor, and grow a team of research engineers focused on full-stack post-training data infrastructure
  • Conduct performance reviews and career development conversations, and provide technical mentorship to team members
  • Foster a culture of engineering excellence, data rigor, and rapid iteration within the team
  • Partner with recruiting to hire world-class research engineering talent
  • Oversee the development and scaling of data collection pipelines for high-value domains (STEM, GDP-valuable tasks, finance, legal, health) and complex agentic workflows (deep research, computer use, shopping agents)
  • Establish and manage partnerships with external data vendors to source and securely prepare expert-level post-training datasets
  • Influence the technical roadmap for data infrastructure in collaboration with the MSL Infra team
  • Translate the strategic vision of research scientists into actionable engineering plans for synthetic data generation, SFT, and RLHF pipelines
  • Partner with research scientists, product teams, and model training teams to align data collection priorities with organizational capability goals
  • Build robust, reusable data pipelines that can rapidly deliver high-quality datasets to multiple model lines
  • Drive the development of tooling that continuously monitors and improves the quality, diversity, and safety of our data
  • Communicate technical progress, challenges, and strategic data decisions to leadership
  • Maintain technical credibility through hands-on contributions to critical data infrastructure projects (20-30% of time)
  • Review code, provide technical guidance, and unblock complex scaling or data-processing challenges
  • Set engineering standards and best practices for the team

Benefits

  • bonus
  • equity
  • benefits