Sr Manager, Site Reliability Engineering

PayPal, Scottsdale, AZ
Hybrid

About The Position

The Company: PayPal has been revolutionizing commerce globally for more than 25 years. Creating innovative experiences that make moving money, selling, and shopping simple, personalized, and secure, PayPal empowers consumers and businesses in approximately 200 markets to join and thrive in the global economy.

We operate a global, two-sided network at scale that connects hundreds of millions of merchants and consumers. We help merchants and consumers connect, transact, and complete payments, whether they are online or in person. PayPal is more than a connection to third-party payment networks. We provide proprietary payment solutions accepted by merchants that enable the completion of payments on our platform on behalf of our customers.

We offer our customers the flexibility to use their accounts to purchase and receive payments for goods and services, as well as the ability to transfer and withdraw funds. We enable consumers to exchange funds more safely with merchants using a variety of funding sources, which may include a bank account, a PayPal or Venmo account balance, PayPal and Venmo branded credit products, a credit card, a debit card, certain cryptocurrencies, or other stored value products such as gift cards, and eligible credit card rewards. Our PayPal, Venmo, and Xoom products also make it safer and simpler for friends and family to transfer funds to each other.

We offer merchants an end-to-end payments solution that provides authorization and settlement capabilities, as well as instant access to funds and payouts. We also help merchants connect with their customers, process exchanges and returns, and manage risk. We enable consumers to engage in cross-border shopping and merchants to extend their global reach while reducing the complexity and friction involved in enabling cross-border trade.

Our beliefs are the foundation for how we conduct business every day. We live each day guided by our core values of Inclusion, Innovation, Collaboration, and Wellness. Together, our values ensure that we work together as one global team with our customers at the center of everything we do, and they push us to ensure we take care of ourselves, each other, and our communities.

Job Summary: We are seeking a strategic and technically strong Senior Manager of Production Operations to lead PayPal’s North America Command Center team. This high-impact leadership role is responsible for ensuring the reliability, scalability, and availability of our 24/7 mission-critical payment infrastructure. You will lead managers and team leads across multiple time zones, driving operational excellence, process maturity, and a culture of continuous improvement. The role also focuses on advancing engineering excellence through AIOps, automation, and proactive reliability practices to reduce MTTD and MTTR across the North America footprint. This is an exciting opportunity for a seasoned leader with deep technical expertise and a passion for operational excellence to make a meaningful impact at scale.

Requirements

  • 8+ years of relevant experience and a Bachelor’s degree, or any equivalent combination of education and experience.
  • Experience leading others.
  • Bachelor’s degree in Computer Science, Information Technology, or a related field; Master's preferred.
  • 8+ years of experience in infrastructure management, with at least 3 years in a leadership role.
  • Extensive experience with multiple cloud platforms (AWS, Azure, GCP) and on-premises infrastructure management.
  • Demonstrated experience building or scaling AI/ML-based automation for operations, including AIOps platforms, alert noise reduction, auto-remediation, and intelligent runbooks.
  • Strong background in incident management, ITIL frameworks, and operational best practices.
  • Experience with monitoring tools, automation platforms, and infrastructure-as-code technologies.
  • Proven track record of leading technical teams in high-pressure, mission-critical environments.
  • Excellent communication skills, with the ability to interact effectively with technical teams and executive leadership.

Nice To Haves

  • Relevant certifications (ITIL, cloud provider certifications, PMP).
  • Deep knowledge of financial regulatory requirements.
  • Background in DevOps practices and CI/CD pipeline management.
  • Experience with containerization technologies and orchestration platforms.

Responsibilities

  • Manage and mentor a team of site reliability engineers, setting performance objectives, providing technical guidance, and ensuring alignment with business goals.
  • Oversee the execution of reliability initiatives, ensuring critical systems maintain high availability, resilience, and performance at scale.
  • Work with engineering, operations, and product teams to ensure seamless integration of reliability best practices into the development, deployment, and operational processes.
  • Lead incident management activities, including coordination of response efforts, root cause analysis, and implementing solutions to prevent future incidents.
  • Define and track key performance indicators (KPIs) related to system reliability, availability, and performance, reporting results to leadership regularly.
  • Promote and drive automation within the site reliability engineering team, ensuring processes are streamlined and systems operate with minimal manual intervention.
  • Manage capacity planning efforts, ensuring the scalability of systems and the ability to handle increasing traffic and resource demands effectively.
  • Ensure the development and testing of disaster recovery plans and procedures, minimizing downtime in the event of a failure.
  • Lead career development and mentorship efforts for team members, ensuring engineers have the tools and opportunities to grow their skills and advance their careers.
  • Work closely with leadership to align site reliability engineering goals with broader organizational objectives, ensuring engineering efforts support business continuity and growth.
  • Lead and manage production operations managers and leads, providing guidance on escalated technical issues and complex infrastructure challenges.
  • Oversee 24/7 monitoring and management of multi/hybrid cloud and on-premises infrastructure, ensuring optimal performance and availability.
  • Set clear goals, provide continuous coaching, and build a leadership pipeline within the team.
  • Champion automation-first practices: drive runbook automation, self-healing systems, and event-driven remediation pipelines.
  • Collaborate with cross-functional teams including DevOps, Security, and Application Development to resolve critical incidents and implement preventive measures.
  • Own and continuously improve SLAs, SLOs, and operational KPIs including MTTD, MTTR, incident frequency, and change failure rate.
  • Provide technical leadership during major incidents, coordinating response efforts and post-incident analysis.
  • Develop team capabilities through mentoring, training, and knowledge sharing programs.

Benefits

  • At PayPal, we’re committed to building an equitable and inclusive global economy. And we can’t do this without our most important asset: you. That’s why we offer comprehensive, choice-based programs to support all aspects of personal wellbeing (physical, emotional, and financial), delivering meaningful value where it matters most. We strive to create a flexible, balanced work culture with a holistic approach to benefits, including generous paid time off, healthcare coverage for you and your family, and resources to create financial security and support your mental health.