About The Position

We are looking for a Technical Program Manager, Quality and Reliability, to champion cross-functional initiatives that enhance quality, reliability, and operational excellence at Harvey. You will be the ultimate owner of Harvey’s product quality. You will collaborate closely with Engineering, Product, and Operations teams to identify and address critical gaps in test coverage, release safety, incident management, and system observability. You will discover reliability risks and partner with Platform teams to drive broad improvements, focusing on building repeatable, scalable reliability guardrails as Harvey navigates hyper-growth.

Requirements

  • 5+ years of experience in technical program management or release management, ideally within SaaS or fast-moving tech companies.
  • Prior experience working as a Software QA or Test Engineer, preferably with SaaS products.
  • Strong understanding of engineering workflows, including CI/CD, release cycles, and infrastructure planning.
  • Experience partnering with engineering and product leadership to achieve cross-team quality and reliability objectives.
  • Excellent communication skills—you can distill complexity into clarity for both technical and non-technical audiences.
  • A track record of building systems and processes that scale with growth.
  • Comfort in ambiguity and eagerness to build structure where there is none.
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field.

Nice To Haves

  • Familiarity with incident management tooling (PagerDuty, Incident.io), monitoring stacks (Datadog, Prometheus, Grafana), and test automation frameworks (Playwright, Cypress, Selenium).

Responsibilities

  • Own release management end-to-end, ensuring on-time, high-quality product releases through coordination across all teams in a fast-paced environment.
  • Introduce and enforce change safety standards, such as risk assessments, rollback procedures, feature flags, and bug bashes, to reduce regressions and customer impact.
  • Lead horizontal reliability initiatives focused on improving test coverage, observability, and incident response readiness.
  • Define, measure, and report on reliability metrics (e.g., change failure rate, MTTR, SLIs), and drive accountability for sustained improvement.
  • Identify systemic gaps in release processes, testing, monitoring, and incident response; convert findings into structured improvement plans with clear owners and timelines.
  • Drive rapid triage and resolution of customer-reported issues in partnership with Product and User Operations, ensuring timely follow-up and continuous improvement.
  • Own and improve the incident management lifecycle: facilitate rigorous post-incident reviews, ensuring root causes are identified and corrective and preventive actions are tracked to completion.
  • Oversee vendor reliability and SLA compliance, including performance monitoring, incident escalation, and periodic business reviews.