Senior Site Reliability Engineer

Crisis Text Line | Chicago, IL
16h | $115,192 - $160,018 | Remote

About The Position

At Crisis Text Line, our mission is to promote mental well-being for people wherever they are. Our technology powers real-time crisis support across the U.S. and globally, connecting people to help when they need it most. We’re looking for a Senior Site Reliability Engineer to strengthen and scale the infrastructure behind our crisis care platform. In this role, you’ll ensure our systems are reliable, resilient, and observable—ready for every moment someone reaches out. You’ll bridge development and operations, champion automation, and drive a culture of reliability across engineering. If you want to build systems that deliver help when it matters most, this is your opportunity.

Requirements

  • 6–8+ years in Infrastructure, SRE, Platform, or DevOps engineering with strong Python and Linux/Unix fundamentals.
  • Advanced AWS expertise (EC2, EKS/ECS, S3, IAM, VPC) and hands-on Kubernetes + Docker in production.
  • Proficiency with Terraform and Infrastructure as Code best practices.
  • Experience owning CI/CD pipelines, deployment automation, and promotion workflows.
  • Observability and reliability experience (Datadog/Prometheus/Grafana, incident response, RCA).
  • Security-minded engineering approach, ideally with exposure to regulated or healthcare environments.
  • Must have a stable high-speed internet connection to support seamless remote collaboration, virtual meetings, online job tasks, etc.

Nice To Haves

  • Strong architectural thinking and ability to modernize legacy systems.
  • Experience with trunk-based development, GitOps practices, or developer tooling.
  • Demonstrated mentorship, technical guidance, or influence on engineering best practices.
  • Effective collaborator with high ownership, able to operate independently and support developers.
  • AWS cost-optimization experience or familiarity with Aurora/database performance.
  • Experience working on mission-critical, distributed, or global-scale platforms.

Responsibilities

  • Develop and maintain automation tools and frameworks to reduce manual operations.
  • Implement Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or similar.
  • Build and manage CI/CD pipelines to support rapid, reliable deployments.
  • Create self-service tooling and platforms that empower development teams.
  • Design, implement, and maintain scalable, reliable, and secure infrastructure to support business-critical applications.
  • Lead incident response, conduct root-cause analysis, and implement long-term preventive fixes.
  • Optimize system performance through capacity planning and resource utilization improvements.
  • Design and implement robust monitoring, logging, and alerting systems.
  • Build dashboards and metrics that provide visibility into system and service health.
  • Establish observability best practices across microservices and distributed systems.
  • Reduce alert fatigue through intelligent alerting, automation, and clear runbooks.

Benefits

  • 20 paid holidays, including: federal holidays like Juneteenth and Labor Day; Election Day; a holiday break from Dec 24 through January 1; 2 renewal days; 2 floating holidays
  • Flexible paid time off, including: 15 vacation days; 3 personal days; 7 sick days
  • Medical, dental, and vision benefits for the staff member and family at no cost to the employee
  • 403(b) retirement plan (the nonprofit equivalent of a 401(k)): 3% contribution by Crisis Text Line to support building financial wellness, regardless of personal contribution
  • 12 weeks paid parental leave (after 6 months of employment)
  • Student loan repayment (after 2 years of continuous full-time service)
  • Family support through a virtual childcare platform
  • Stipends/allowances: mental health (monthly); internet service (monthly); professional development (annual); wellness (annual); home office setup (one-time/first year)
  • (Benefits are only for US-based employees. International benefits may differ.)