Site Reliability Engineering (SRE) Intern

AWP Safety | North Canton, OH
18h | $30 - $34 | Remote

About The Position

The AWP Safety IT Internship Program provides a hands‑on, high‑impact learning experience designed for early‑career professionals who want to build a future in IT Site Reliability Engineering. In this role, you won't just be watching application performance monitoring dashboards; you will be building the observability pipelines that keep our infrastructure and applications resilient, highly available, and robust. You will work at the intersection of Software Engineering and Systems Operations, using Dynatrace as your primary lens to diagnose performance bottlenecks and automate "toil" out of existence.

This 10-week internship places interns at the center of our IT operations, offering meaningful work with real organizational impact. The internship is primarily project‑based and can be remote depending on location. You will also collaborate closely with cross‑functional teams, including Platform, AppDev, and Security, on production‑grade outcomes and see how technical insights drive real‑world business results. You’ll thrive if you have a passion for “measuring everything”.

Join us for an IT internship that strengthens your technical abilities, builds your professional confidence, and prepares you for a future in high‑impact SRE roles. Apply today and help shape the technical insights that power AWP Safety.

Requirements

  • Rising junior/senior or current master’s student.
  • Clear communication and teamwork skills in fast‑moving ops environments.
  • Systems Thinking: Understand how web apps, databases, and networks interact.
  • SRE Mindset: Care deeply about reliability, scalability, and error budgets.
  • Scripting Proficiency: Familiarity with Python, Go, or PowerShell.
  • Cloud Basics: Exposure to containers (Docker/Kubernetes) and microservices patterns.
  • Data Fluency: Read metrics, logs, and traces to tell a story about system health.

Responsibilities

  • Observability‑as‑Code: Help deploy and configure Dynatrace OneAgent and ActiveGates with automated tooling.
  • SLI/SLO Implementation: Define and instrument user‑centric metrics and objectives in Dynatrace.
  • AI‑Assisted Troubleshooting: Combine Davis® AI with Copilot/Claude to identify root causes and reduce MTTR.
  • Dashboard Engineering: Build actionable, real‑time dashboards for application and cloud health.
  • Automation & Scripting: Write Python/Bash to trigger self‑healing or response playbooks from alerts.