Site Reliability Engineer

ConductorOne
Portland, OR
3d

About The Position

Do you remember the first time you got paged at 3am for an incident, and your runbook actually worked? Or the moment you watched traffic seamlessly fail over to a secondary region while you sipped your coffee? There's a particular satisfaction in building systems so reliable that the most exciting thing about on-call is how boring it is. If you've ever felt the quiet pride of deleting an alert because you automated away the underlying problem, or the calm confidence of watching your system absorb a traffic spike without breaking a sweat, you know what we're talking about.

At ConductorOne, we know that the world has been transformed yet again and we never want to go back. We're building AI tools and we're building with AI tools, every day. We ship fast and we ship constantly. The hard problems are still hard, but the pace at which we can attack them has changed forever.

We are looking for an exceptional Site Reliability Engineer to join us in Portland. You'll own the reliability, scalability, and operational excellence of a platform that enterprises trust with their identity security. This is a senior technical role where you'll build the observability, automation, and infrastructure that lets our engineering teams ship with confidence. If you bias toward action and care deeply about building systems that just work, we want to talk to you.

Here are a few blog posts highlighting how we approach hard problems:

  • Incremental Sync: How ConductorOne Keeps Identity Data Fresh in Real Time
  • Building Trust Through Connector Reliability

About ConductorOne

ConductorOne is the first AI-native identity security platform that protects every identity: human, non-human, and AI. We make it possible for enterprises to move beyond the limitations of legacy identity governance, reducing their attack surface while actually improving the experience for everyone involved. Forward-thinking companies like DigitalOcean, Instacart, Ramp, and Zscaler trust ConductorOne to secure their identities. We're backed by top investors and growing fast, but we're still small enough that every engineer shapes the product and culture.

Requirements

  • A track record of building and operating production systems at scale—you've kept real systems running for real customers
  • Deep experience with cloud infrastructure (AWS, GCP, or similar) and infrastructure-as-code (Terraform, Pulumi, or similar)
  • Strong programming skills in Go, Python, or similar—you write tools and automation, not just scripts
  • Experience with Kubernetes and container orchestration in production
  • Strong systems thinking—you understand how components interact and where failures cascade
  • High agency—you figure out what needs to be built, not just how to build what you're told. You move fast and unblock yourself.
  • Deep understanding of observability—you know how to instrument systems, build dashboards, and create alerts that actually matter
  • Experience with AI-assisted development (Claude Code, Cursor, Copilot, or similar)—you're already using these tools and excited about what's next
  • Clear, persuasive communication—you can explain complex systems to diverse audiences and drive alignment during incidents
  • Ego in check—you care about getting it right, not being right

Nice To Haves

  • Experience with AI/ML infrastructure or serving LLMs in production
  • Background in security-focused environments or compliance frameworks (SOC 2, FedRAMP, etc.)
  • Experience building developer platforms or internal tooling
  • Familiarity with identity systems and protocols (SCIM, SAML, OAuth, LDAP)
  • Contributions to open source projects or engineering communities

Responsibilities

  • Own the reliability and scalability of our platform—design, build, and operate the infrastructure that keeps ConductorOne running for customers who depend on us
  • Build observability that drives action—create monitoring, alerting, and tooling that helps teams understand system behavior and respond to incidents quickly
  • Drive operational excellence across engineering—partner with product teams to ensure new features are built with reliability in mind from the start
  • Automate relentlessly—if you're doing something twice, build a system to do it for you. We believe in infrastructure as code and eliminating toil.
  • Respond to and learn from incidents—lead incident response, conduct blameless postmortems, and drive systemic improvements
  • Plan and execute infrastructure projects with incremental deliverables—you'll assess technical risks, communicate tradeoffs, and ship iteratively