Senior Observability Engineer

UnitedHealth Group•La Crosse, WI

1d•$91,700 - $163,700

About The Position

Opportunities with Logistics Health Incorporated (LHI), part of the Optum family of businesses. We’re dedicated to simplifying the logistics of complex workforce health programs with cost-effective solutions and a seamless distribution process. With offices in La Crosse, Wis., a satellite office in Chicago, and remote employees throughout the country, we have a variety of rewarding career opportunities for you. Elevate your career as you help us create a healthier tomorrow for everyone and discover the meaning behind Caring. Connecting. Growing together.

The OptumServe Enterprise Monitoring team is looking for a Senior Observability Engineer. The team is responsible for enterprise infrastructure, application, and network monitoring across on-prem, hybrid, and multiple cloud environments. The selected candidate will join a team of skilled engineers with a broad background in enterprise monitoring and observability. This role focuses on maintaining the reliability, scalability, and availability of our log management solution and our metrics and observability platform, which rely heavily on automation (Terraform, Ansible, and scripts). It also requires maintaining the performance KPIs of our solutions and defining their SLOs.

Requirements

2+ years of experience working directly with monitoring tools as an admin, SME, or architect, preferably with Dynatrace and/or Elasticsearch
2+ years of experience with Dynatrace (managed, cloud, and offline deployments, including best practices and setup for ActiveGate, cloud, on-prem, and custom configurations with workflows) or with Elastic on-prem and in the cloud, including platform best practices
1+ years of experience designing data pipelines using Filebeat, Logstash, and/or Fluent Bit/Fluentd
1+ years of experience applying AI to observability to reduce manual work and make our products more reliable and resilient
1+ years of experience writing scripts in Python and either Bash or PowerShell to automate tasks
1+ years of experience working with Linux OS
United States Citizenship
If you are offered this position, you will be required to provide extensive personal information to obtain and maintain a suitability or determination of eligibility for a Confidential/Secret or Top Secret security clearance as a condition of your employment

Nice To Haves

BS/MS in CS/engineering or equivalent OR 5+ years of experience
1+ years of experience with Terraform and Ansible, including syntax, best practices, and managing complex configurations across multiple commercial and government clouds to build and manage infrastructure and applications
1+ years of scripting experience (JavaScript, Java, PowerShell, or others)
Experience with SNMP, tcpdump, and tracing
Proven knowledge of AIOps platforms

Responsibilities

Maintain and deploy monitoring and alerting
Design, configure, and maintain a large-scale log aggregation solution
Set up and manage ingestion pipelines and data transformations
Have the mindset of “automate any task”
Monitoring and Alerting: Build and maintain robust monitoring systems using tools like ELK, Dynatrace, Prometheus, OpenTelemetry (OTel), and Grafana to detect potential issues early and trigger alerts for timely response
Maintain associated documentation as it applies to our audit and certification requirements
Participate in troubleshooting, capacity planning, and performance analysis activities
Research new monitoring requirements and, in many cases, write code to meet them
Intermediate to expert skill in setting up AI rules for tools like Davis AI (Dynatrace) and/or Elastic GenAI
Solid expertise in setting up monitoring policies, rules, and templates, and in writing scripts to accomplish monitoring requirements