Site Reliability Engineer (Infrastructure)

FIS Global | Atlanta, GA
2d | Hybrid

About The Position

We are seeking a highly skilled Operations & Support Lead/SME to lead our technical operations, incident/request/change management, database support, and cloud platform support functions. This role oversees teams responsible for delivering stable, secure, and high‑performing services across Windows Private Cloud, Azure Public Cloud, MSSQL, and infrastructure automation and tooling (Terraform/TFE, Jenkins, Harness, cloud deployment pipelines). The Lead will ensure operational excellence in a multi‑tenant, always‑on environment while implementing industry best practices for reliability, automation, and service quality. The role also plays a key leadership part in supporting our ongoing migration and modernization initiatives across Azure.

About The Team

The team supports the Treasury & Risk (T&R) Platforms within the Capital Markets space, providing infrastructure and SRE operations support for various products and platforms within T&R. Our support team is spread across the globe (US, UK, India, Australia, NZ, and the Philippines) and supports almost a thousand clients.

Requirements

  • Strong hands‑on understanding of Azure Public Cloud operations.
  • Experience managing Windows Private Cloud or virtualized infrastructure.
  • Proficiency with MSSQL environments, database administration fundamentals, and troubleshooting.
  • Experience with CI/CD and automation tools such as Jenkins, Harness, Terraform/TFE.
  • Knowledge of networking, security models, monitoring systems, and high‑availability architectures.
  • Proven experience managing technical operations or cloud operations teams.
  • Strong understanding of ITIL processes (Incident, Problem, Change, Release).
  • Ability to manage multiple priorities, crisis situations, and operational escalations.
  • Excellent communication, stakeholder management, and decision‑making abilities.

Responsibilities

  Service & Operations Management:
  • Oversee day‑to‑day operations for infrastructure, application support, database support, and cloud platforms, ensuring high availability, performance, and reliability.
  • Manage and optimize cloud infrastructure (Azure) and private cloud environments, aligning with organizational needs.
  • Implement and maintain operational procedures, KPIs, SLAs, and governance for ITSM processes across Incident, Request, and Change Management.
  • Monitor system performance and proactively address disruptions, capacity issues, bottlenecks, or reliability concerns.
  • Coordinate with internal teams to ensure smooth integration between applications, databases, and cloud services.

  Leadership & Team Management:
  • Lead a cross‑functional team covering operations engineers, technical support specialists, and DBAs.
  • Provide coaching, mentoring, skill development, and performance management.
  • Ensure the team maintains depth and breadth of skills to support current and emerging technologies.
  • Foster a culture of continuous improvement, operational discipline, and customer obsession.

  Cloud & Platform Engineering:
  • Drive implementation of best practices for Azure cloud operations, including monitoring, configuration, scaling, and cost optimization.
  • Oversee automation initiatives using Terraform/TFE, Harness, Jenkins, and related pipeline tools.
  • Support infrastructure-as-code adoption and the automation of provisioning, deployments, and environment configuration.

  Technical Support & Issue Resolution:
  • Serve as the point of escalation for critical technical issues, ensuring prompt resolution through coordination across teams.
  • Ensure service tickets and incidents are handled within expected timelines and quality standards.
  • Drive root-cause analysis (RCA) and preventive measures to reduce recurring operational issues.

  Database & Application Support:
  • Oversee MSSQL database operations including performance tuning, maintenance, backup/restore, incident resolution, and monitoring.
  • Work with DBAs to ensure database availability and security across multi‑tenant architectures.

  Security, Compliance & Risk Management:
  • Ensure cloud and on‑prem systems comply with corporate security policies, data protection policies, and industry regulations.
  • Work closely with InfoSec and Architecture teams to ensure safe cloud adoption aligned with organizational standards.
  • Implement and maintain disaster recovery, backup strategies, and change control processes.

  Vendor, Stakeholder & Cross‑Team Collaboration:
  • Manage relationships with cloud service providers, tool vendors, and support partners.
  • Collaborate with product teams, engineering, on‑prem support teams, and business stakeholders to align services with business goals.
  • Support cross‑functional initiatives such as cloud modernization, platform enhancements, and operational readiness.

Benefits

  • Opportunities to innovate in fintech
  • Tools for personal and professional growth
  • Inclusive and diverse work environment
  • Resources to invest in your community
  • Competitive salary and benefits