ServiceNow
Posted 3 days ago
$163,600 - $286,300/Yr
Full-time • Senior
Kirkland, WA

About the position

In this role, you will be responsible for designing and deploying networks based on business and technical requirements. You will partner with project and program managers to meet overall timelines and resolve issues. Your duties will include operating and troubleshooting networks to quickly identify and resolve issues, taking a lead role in engaging and mitigating outage-causing events, and validating problem descriptions while performing detailed problem diagnosis. You will also engage deeply in the sustainment function to proactively analyze network parameters such as capacity and availability to ensure issues are fixed before they cause an outage. Additionally, you will review, consult, and prepare for planned change introductions to the production environment, participate in a rotating 'on call' schedule with other team members, and partner with teams to plan and execute software code upgrades and device maintenance. You will also provide mentorship and input on operational process improvements to the Site Reliability Engineering (SRE) team and give feedback to infrastructure architects on design issues or improvements.

Responsibilities

  • Design and deploy networks based on business and technical requirements.
  • Partner with project and program managers to meet overall timelines and resolve issues.
  • Operate and troubleshoot networks to identify and resolve issues quickly.
  • Take a lead role in the engagement and mitigation of outage-causing events or issues.
  • Validate problem descriptions and perform detailed problem diagnosis; track and update problems in the trouble-ticketing system.
  • Perform non-critical investigations into functionality that is not working as desired.
  • Engage deeply in the sustainment function to proactively analyze network parameters such as capacity and availability.
  • Review, consult on, and prepare for planned changes introduced to the production environment.
  • Participate in a rotating 'on call' schedule, including weekends, with other members of the team.
  • Partner with teams to plan and execute software code upgrades and device maintenance.
  • Partner with the Site Reliability Engineering (SRE) team to provide mentorship and input on operational process improvements.
  • Provide feedback to infrastructure architects on design issues or improvements.

Requirements

  • Experience using AI, or thinking critically about how to integrate it, in work processes, decision-making, or problem-solving.
  • 6+ years of experience in software development with a focus on network systems and technology.
  • Experience with container technologies (Docker, Podman) and Kubernetes orchestration platforms.
  • Knowledge of load balancing technologies: F5, Envoy, NGINX+, and Cilium.
  • Advanced proficiency with Terraform for multi-region infrastructure provisioning.
  • Experience with Ansible for configuration management and automated deployments.
  • Deep knowledge of Layer 4 and Layer 7 (the transport and application layers).
  • Experience with large-scale infrastructure migrations.
  • Understanding of certificate management, TLS termination, and traffic routing strategies.
  • Experience with infrastructure monitoring, metrics collection, and observability platforms such as Prometheus, Grafana, or Splunk.
  • Knowledge of network security tools including nftables, iptables, and firewall management.
  • Background in enterprise-scale, high-availability infrastructure supporting global operations.

Benefits

  • Health plans, including flexible spending accounts.
  • 401(k) Plan with company match.
  • Employee Stock Purchase Plan (ESPP).
  • Matching donations.
  • Flexible time away plan.
  • Family leave programs.