Sr Engineer, AI Ops (70009976)

Optimum
Plano, TX
$83,538 - $119,340

About The Position

The AI Ops Engineer builds and operates the systems that enable AI-driven automation, anomaly detection, and intelligent decisioning across operational platforms. This role bridges observability, data engineering, and applied machine learning to productionize AI capabilities at scale.

At Optimum, we're fueled by our four core pillars: Taking Ownership, Upholding Transparency, Creating Community, and Demonstrating Expertise. Our commitment to empowering employees to take responsibility and embrace proactive problem-solving underpins Taking Ownership. Upholding Transparency is at the core of our culture, with open and honest communication fostering trust among our dedicated team and loyal customers. Creating Community is more than a goal; it's our daily commitment to fostering an environment of collaboration, innovation, and positivity. Demonstrating Expertise is a promise we uphold through continuous learning and engagement with our customers to consistently deliver top-quality products and services. These pillars not only shape our culture but also define Optimum as a place of excellence, trustworthiness, and thriving community, and we invite you to be a part of our journey.

Requirements

  • 5+ years of experience in operations engineering, platform engineering, or ML systems
  • Experience deploying ML or statistical models in production
  • Strong Python skills and familiarity with ML frameworks or libraries
  • Experience working with observability or operational data
  • Understanding of model monitoring, drift, and lifecycle management
  • Experience working within an Agile or SAFe (Scaled Agile Framework) work model

Nice To Haves

  • Experience with AIOps platforms or custom AIOps implementations
  • Familiarity with time-series data, anomaly detection, or forecasting
  • Exposure to automation frameworks and ITSM integration
  • Experience operating AI systems in regulated or mission-critical environments

Responsibilities

  • Design and operate AI Ops pipelines for anomaly detection, prediction, and automation
  • Integrate AI models with observability, ticketing, and automation platforms
  • Deploy and monitor models in production environments
  • Build feedback loops for model retraining and improvement
  • Define operational metrics for AI effectiveness and trust
  • Partner with SRE, NOC, and platform teams on AI-assisted workflows
  • Implement guardrails, explainability, and confidence scoring
  • Support incident response and root cause analysis using AI techniques