Staff AI Systems Engineer

Flock Safety
$200,000 - $225,000 | Hybrid

About The Position

We’re hiring a Staff AI Systems Engineer to help support our emerging product, Night Shift, an AI research assistant that amplifies the impact of investigators by automating the tedious, repetitive steps involved in working a case. This role sits within the Machine Learning team and will work closely with partners in Engineering (Backend, Frontend, and Design) in a fast-paced environment. You will be one of the earliest technical contributors to our system architecture for agentic AI, and will own our AI evaluation framework. The outcome we’re after is clear and ambitious: measurably faster, more accurate leads for every officer and every shift.

Requirements

  • Familiarity with Agentic Systems: Hands-on experience with LLM agents, including:
      • LLM API use (e.g. LangChain/LangGraph, vLLM, OpenAI/Gemini/Anthropic APIs)
      • Agent design: tool use (e.g. via MCP), retrieval, memory, grounding/attribution for claims, and guardrails
      • Architectural patterns: planning and hand-off for multi-agent systems, context management
      • RAG: vector/hybrid search (e.g. pgvector, turbopuffer, rerankers)
  • ML Platform expertise: 5+ years building and shipping ML systems to production, with experience in the following areas:
      • Backend: Python and JS familiarity required; TypeScript/Golang familiarity welcome
      • Web services (e.g. Express/FastAPI, REST, SSE, JWTs)
      • Cloud infrastructure (e.g. AWS, Terraform, VPC, networking)
      • Backend databases/stores (e.g. Postgres, Redis)
      • Observability (e.g. Prometheus, Grafana, OpenTelemetry, LangSmith/Langfuse)
  • Experience with LLM Evaluations at scale: You’ve built offline/online eval harnesses and are familiar with the methodologies and metrics to measure:
      • Search, retrieval, and recommendation performance
      • Safety & robustness (security, compliance, red-teaming, regression testing)
      • Cost, performance, and latency trade-offs

Nice To Haves

  • Durable execution (e.g. Temporal, Hatchet)
  • OLAP (e.g. ClickHouse, BigQuery)
  • ML Inference (e.g. PyTorch, TensorRT, NVIDIA Triton), ideally in multimodal domains (text/image/video)
  • Compute orchestration (e.g. Kubernetes, Prefect, Ray)
  • Agentic task success, trajectory quality, preference learning (e.g. SFT, DPO, RLHF, LLM-as-judge)

Responsibilities

  • Deliver the MVP evaluation harness to produce initial metrics, enable debugging, and perform regression testing.
  • Take on a system feature that offers demonstrated improvement against your MVP evaluation suite.
  • Productionize the evaluation and observability platform and make it the source of truth for quality and safety (e.g. online/offline tracing, alerting, dashboards, evaluations, and a PR-gated regression suite).
  • Own the roadmap for evolving the agent evaluation platform.
  • Lead deeper R&D threads (e.g., lightweight fine-tuned projection layers, specialized embeddings, multimodal understanding) that can improve system performance on core metrics.

Benefits

  • Flexible PTO: We offer non-accrual PTO, plus 11 company holidays.
  • Fully paid health benefits plan for employees: Includes medical, dental, and vision coverage, plus an HSA match.
  • Family Leave: All employees receive 12 weeks of 100% paid parental leave. Birthing parents are eligible for an additional 6-8 weeks of physical recovery time.
  • Fertility & Family Benefits: We have partnered with Maven, a complete digital health benefit for starting and raising a family. Flock will provide a $50,000 lifetime maximum benefit related to eligible adoption, surrogacy, or fertility expenses.
  • Spring Health: Spring Health offers a variety of mental health benefits, including therapy, coaching, medication management, and digital tools, all tailored to each individual's needs.
  • Caregiver Support: We have partnered with Cariloop to provide our employees with caregiver support.
  • Carta Tax Advisor: Employees receive 1:1 sessions with Equity Tax Advisors who can address individual grants, model tax scenarios, and answer general questions.
  • ERGs: We want all employees to thrive and feel like they belong at Flock. We offer four ERGs today: Women of Flock, Flock Proud, LEOs, and Melanin Motion. If you are interested in talking to a representative from one of these, please let your recruiter know.
  • WFH Stipend: $150 per month to cover the costs of working from home.
  • Productivity Stipend: $300 per year to use on Audible, Calm, Masterclass, Duolingo and so much more.
  • Home Office Stipend: A one-time $750 to help you create your dream office.