Senior SRE, Software Engineering (AWS / Scaling Infrastructure)

PulseRise Technologies
New York, NY
Hybrid

About The Position

We are looking for two Senior Site Reliability Engineers to build and scale reliability foundations for a rapidly growing fintech platform. This role focuses on architecting resilient infrastructure, strengthening observability, and establishing sustainable SRE practices as systems scale from thousands to millions of users. You will lead incident response, design highly available cloud architectures, and ensure engineering teams can ship quickly without compromising reliability. The position requires deep AWS expertise, strong infrastructure-as-code experience, and a proactive reliability mindset. You will partner closely with feature teams to design scalable databases, async workflows, and data pipelines. This is a high-impact hybrid role based in NYC for engineers who thrive in fast-scaling environments.

Requirements

  • 5+ years of SRE/DevOps experience OR 7+ years of software engineering with strong infrastructure focus
  • Proven experience leading incident response for high-availability production systems
  • Strong AWS expertise (EC2, Fargate, networking, scaling strategies)
  • Experience with infrastructure as code (Terraform preferred)
  • Hands-on experience implementing observability solutions (Datadog, Prometheus, ELK, etc.)
  • Experience designing CI/CD pipelines and deployment automation
  • Strong knowledge of scalable system design and production reliability practices
  • Excellent documentation and cross-team communication skills

Nice To Haves

  • Experience scaling fintech or regulated systems
  • Experience working in high-performance engineering cultures
  • Evidence of an entrepreneurial or high-initiative background
  • Experience designing async workflow infrastructure or high-scale data pipelines

Responsibilities

  • Lead incident response and establish sustainable on-call processes
  • Create comprehensive runbooks and foster a blameless postmortem culture
  • Architect highly available, scalable cloud infrastructure on AWS
  • Design auto-scaling, health checks, and graceful degradation strategies
  • Implement and evangelize modern observability tooling (monitoring, logging, tracing)
  • Develop infrastructure as code using Terraform or CloudFormation
  • Build and improve CI/CD pipelines with advanced deployment strategies (blue/green, canary)
  • Partner with engineering teams to embed reliability into feature design
  • Improve database performance, async workflows, and data pipeline reliability
  • Reduce MTTR through systematic process and tooling improvements