Principal DevOps Engineer - Federal

Zscaler, San Jose, CA
Remote

About The Position

Zscaler accelerates digital transformation to ensure our customers can be more agile, efficient, resilient, and secure. As an AI-forward enterprise, we are constantly pushing the envelope, leveraging the world’s largest security data lake to power our cloud-native Zero Trust Exchange platform. This innovation protects our customers from cyberattacks and data loss by securely connecting users, devices, and applications in any location.

Here, impact in your role matters more than title, and trust is built on results. We say: impact over activity. We seek innovators who actively use AI to amplify their impact and who thrive in an environment where we leverage intelligent systems to stay ahead of evolving threats. We believe in transparency and value constructive, honest debate; we’re focused on getting to the best ideas, faster. We build high-performing teams that can make an impact quickly and with high quality. To do this, we are building a culture of execution centered on customer obsession, collaboration, ownership, and accountability. We value high impact and high accountability, with a sense of urgency, in an environment where you’re enabled to do your best work and embrace your potential.

If you’re driven by purpose, thrive on solving complex challenges, and want to be part of the team that’s helping to secure the AI age, we invite you to bring your talents to Zscaler and help shape the future of cybersecurity.

Role

We are looking for a Principal DevOps Engineer to join our team. This is a remote position, reporting to the VPE of Infrastructure. You will architect and manage the global cloud infrastructure for our Zero Trust Networking Services, owning the delivery pipeline and operational health of a massive-scale, multi-region distributed system. Bridging the gap between Network Engineering and Cloud Operations, you will orchestrate highly scalable, distributed control and data plane services in the public cloud.

Requirements

  • 12+ years of overall experience in Software Engineering, DevOps, or Site Reliability Engineering combined with a BS/MS in Computer Science or relevant field
  • Deep mastery of AWS services and architecting AWS-managed SQL and NoSQL data stores, with a focus on designing scalable local and multi-region deployment strategies
  • Advanced expertise in Infrastructure as Code using Terraform and expert proficiency in Python or Go for building automation tooling
  • Strong knowledge of Linux/BSD internals, observability stacks (Prometheus, InfluxDB), and security compliance (PKI, IAM, DevSecOps)
  • U.S. citizenship due to the nature of the customers assigned to this role

Nice To Haves

  • Experience managing large-scale distributed systems or IoT-style connectivity platforms
  • Background in Network Engineering or working with Data Plane forwarding technologies
  • Experience with Chaos Engineering methodologies in a public cloud environment

Responsibilities

  • Design and implement a multi-region AWS architecture and lead the development of modular Terraform libraries to automate provisioning across diverse geographies
  • Architect self-healing infrastructure using advanced cloud load balancing, auto-scaling patterns, and Multi-AZ database topologies to ensure high availability
  • Modernize CI/CD pipelines and implement Blue/Green and Canary deployment strategies to ensure zero-downtime upgrades for a continuously running global network service
  • Build comprehensive SRE dashboards and implement intelligent alerting frameworks to detect regional outages or capacity exhaustion before they impact customers
  • Monitor cloud resource utilization and implement scaling policies that balance performance requirements with cost-efficiency

Benefits

  • Various health plans
  • Time off plans for vacation and sick time
  • Parental leave options
  • Retirement options
  • Education reimbursement
  • In-office perks, and more!