Senior Manager, Site Reliability Engineering (Global)

Palo Alto Networks

About The Position

We are looking for a visionary Senior Manager of Site Reliability Engineering to lead our global SRE organization across the US, India, and Israel. This isn't just a "keep the lights on" role; you will be the primary architect of our AIOps transformation at Palo Alto Networks. You will bridge the gap between infrastructure products and operational excellence, gathering complex requirements from product teams and translating them into automated, intelligent platform capabilities so that our systems are not just reliable but self-healing.

Requirements

10+ years in Infrastructure, SRE, or DevOps environments.
5+ years managing global teams of 15+ engineers across multiple time zones.
Deep understanding of Kubernetes, Cloud Native ecosystems (AWS/GCP/Azure), and CI/CD pipelines.
Proven track record of implementing ML-driven monitoring (e.g., anomaly detection, automated root cause analysis).
Exceptional ability to translate "deep tech" into business value for C-suite stakeholders.

Responsibilities

Directly manage and scale a high-performing, multi-geographical SRE team (US, India, and Israel), fostering a culture of psychological safety, continuous learning, and "operational pride."
Standardize SRE practices globally while respecting local nuances, ensuring 24/7 coverage models (Follow-the-Sun) are seamless and burnout-resistant.
Manage the financial aspects of global headcount and cloud infrastructure spend.
Drive the AIOps Roadmap: Transition the organization from reactive monitoring to proactive, AI-driven observability and incident remediation, using machine learning to reduce Mean Time to Recovery (MTTR).
Act as the lead consultant for infrastructure product teams to define what "reliability" looks like for next-gen AI services.
Partner with the Platform Engineering team to build and internalize "Golden Paths" that bake in SLOs, error budgets, and automated canary analysis.
Work hand-in-hand with InfoSec and Compliance to automate guardrails (Policy-as-Code) and ensure global data sovereignty requirements are met.
Influence R&D leadership to prioritize non-functional requirements and technical debt reduction.
