DevOps / Site Reliability Engineer

General Matter
Los Angeles, CA
12d
$100,000 - $200,000

About The Position

About the Company

General Matter is enriching uranium in America. Our mission is to restore our country’s ability to make nuclear fuel. Our fuel will help power AI, manufacturing, and other critical industries. It will power our next generation of reactors. Ultimately, it will power our national ambitions. We were incubated by Founders Fund, like Anduril and Palantir before us, and we are backed by top tier investors. Our lean, world-class team of engineers and operators is applying a first-principles approach to solving the problem of nuclear fuel production. We are a mission-driven company with a culture of urgency, accountability and transparency.

About This Role

As a General Matter Embedded Software Engineer, you will develop performant, safe and high-quality software to directly control our systems. Your code will be responsible for commanding actuators and processing high-speed signals in applications where safety and accuracy are exceedingly important. You will work closely with cross-functional teams, including electrical engineers, software engineers, chemical engineers, manufacturing engineers, nuclear engineers, materials scientists and physicists. If you seek high-impact and are excited by fast-paced, intense, Skunkworks-style projects, we encourage you to reach out to join our team.

DevOps / Site Reliability Engineer

We are seeking a highly capable DevOps / Site Reliability Engineer to help build and operate the software systems underpinning uranium enrichment R&D and production infrastructure. This role is foundational to our reliability, safety, and developer velocity. You will be responsible for designing and maintaining observability, alerting, and developer productivity systems, and for ensuring that critical internal and production services are correctly instrumented and monitored. We are only interested in candidates with strong fundamentals, sound judgment, and the ability to operate with rigor in a production environment where failures matter.

Requirements

  • Strong fundamentals in web service development and distributed systems
  • Solid understanding of networking concepts, DNS, TLS/certificate management, and HTTP
  • Experience operating and debugging production systems
  • Familiarity with observability tools (metrics, logging, alerting) and incident response
  • Ability to write clear, maintainable code and automation scripts
  • Demonstrated ownership, attention to detail, and sound technical judgment
  • Ability to work extended hours and weekends as necessary

Nice To Haves

  • Experience with modern observability stacks (e.g., Prometheus, Grafana, OpenTelemetry, Datadog)
  • Hands-on experience with cloud infrastructure and infrastructure-as-code
  • Exposure to CI/CD pipelines and developer tooling at scale
  • Experience supporting safety-critical or high-reliability systems
  • Strong debugging skills across application, OS, and network boundaries
  • Prior on-call experience in a production environment

Responsibilities

  • Design, implement, and maintain observability and alerting systems across critical services and infrastructure
  • Ensure all production and internal services are properly instrumented with metrics, logs, and traces
  • Own and maintain developer productivity tools, CI/CD systems, and internal platforms
  • Participate in an on-call rotation and respond to production incidents with urgency and discipline
  • Lead incident reviews and drive long-term reliability improvements
  • Automate operational workflows to reduce manual toil and improve system resilience

Benefits

  • Access to medical, vision & dental coverage
  • Access to a 401(k) retirement plan
  • Long-term incentives, in the form of stock options