Problem Management Service Analyst-Consultant - Remote

Allstate
McCullom Lake, IL
3d
Remote

About The Position

At Allstate, great things happen when our people work together to protect families and their belongings from life’s uncertainties. For more than 90 years, our innovative drive has kept us a step ahead of our customers’ evolving needs, from advocating for seat belts, air bags, and graduated driving laws to leading the industry in pricing sophistication, telematics, and, more recently, device and identity protection.

Job Description

Allstate operates a global Service Operations capability across the UK, India, and the US, bringing together Major Incident Management, Network Operations, and Problem Management to protect and enhance the availability of our critical Digital Products.

As a Problem Management Service Analyst, you will play a key role within a mature, outcome-driven Problem Management team, working in close partnership with Digital Product and Engineering teams operating within an Outcome Based Delivery (OBD) model. You will lead enterprise-wide problem investigations driven by major incidents and proactive analysis, with a strong focus on identifying true root causes and enabling sustainable improvements to product reliability and resilience.

Success in this role is measured by the quality of problem investigations, the reduction in repeat incidents, and the extent to which learning drives measurable improvements in product availability. The role provides exposure to enterprise-level reliability challenges and close collaboration with senior engineering teams, supporting development into senior problem, reliability, or service leadership roles.

Requirements

  • Minimum 2 years’ experience in a Problem Management, Incident Management, or Service Operations role within a production operations or service environment.
  • Hands-on experience with ServiceNow for Problem and Incident management.
  • Demonstrated experience leading structured problem investigations for major or high-impact incidents, including root cause identification, documented causal analysis, and driving corrective or preventative actions through to completion.
  • Experience using data visualization or reporting tools (e.g., Power BI, ServiceNow).

Nice To Haves

  • Experience working with observability, monitoring, or telemetry data (e.g., logs, metrics, traces).
  • Exposure to reliability or resilience practices (e.g., SRE concepts, error budgets, availability targets, resilience testing, or failure mode analysis) within a production environment.
  • Experience operating in Agile, DevOps, or product-centric delivery models.

Responsibilities

  • Drive enterprise problem investigations arising from major incidents and proactive analysis, working in close partnership with Digital Product and Engineering teams to identify true root causes and prevent recurrence.
  • Analyze incidents, problems and availability data to identify systemic risks, recurring failure patterns, and reliability gaps, translating insights into actionable improvement opportunities for Digital Products.
  • Partner with Digital Product and Engineering teams to strengthen service resilience, including improvements to monitoring, alerting, recovery, and preventative controls that reduce customer impact.
  • Use learnings from problem investigations to influence improvements in automated service restoration and operational readiness, maintaining a strong focus on availability outcomes.
  • Contribute to Major Incident Management and Retro activities when required, providing investigative insight, historical context, and problem-oriented thinking during high-severity events.
  • Continuously improve problem management practices, tooling, and ways of working, partnering with Digital Product and Engineering teams to embed learning and prevention and drive meaningful, lasting change.