NOC Infrastructure Engineer, Data Center

Aqueduct Technologies Inc.
Canton, MA
6d

About The Position

Aqueduct Technologies is seeking a Staff-level NOC Infrastructure Engineer to serve as a technical leader and escalation point within the Network Operations Center (NOC) team, specializing in data center infrastructure and hybrid cloud environments. You will own complex incident resolution and deep technical investigations across servers, virtualization, storage, backup/recovery, and identity, with a strong emphasis on resilience, recoverability, and operational excellence.

The NOC Infrastructure Engineer embodies our values of professionalism, empathy, and technical excellence. We’re looking for someone who operates with a high level of autonomy and accountability, driving incident resolution, strengthening documentation and standards, and mentoring others to improve team capability and reliability of services.

Staff-Level Expectations:

  • Perform the role independently with little need for oversight except with particularly difficult issues or situations.
  • Serve as the primary ticket owner for infrastructure queue work based on assignment, expertise, and business need.
  • Invest dedicated time in training with the goal of becoming a Subject Matter Expert in at least half of Aqueduct’s core infrastructure applications/support areas.
  • Appropriately engage the Lead or higher-level resource when significant issues arise.
  • Participate in the On-Call after-hours support rotation.

Requirements

  • Bachelor’s degree in Information Technology, Computer Science, or related field (or equivalent experience).
  • 5+ years of progressive experience supporting data center infrastructure.
  • Demonstrated experience owning complex incidents independently (including after-hours/high-impact events).
  • Strong troubleshooting ability across virtualization, storage, and server infrastructure.
  • Hands-on experience with backup/restore operations in Veeam and/or Rubrik.
  • Working knowledge of Azure operations and Entra ID administration in support of hybrid environments.
  • Strong change hygiene: risk assessment, rollback planning, and clean validation practices.
  • Clear, disciplined documentation and customer-facing communication.
  • Mentor who guides and uplifts junior and mid-level engineers.

Nice To Haves

  • Experience in an MSP or multi-tenant environment.

Responsibilities

  • Data Center Infrastructure Operations
    • Support and troubleshoot customer data center environments including:
      • Physical servers, firmware/driver baselines, hardware health, and break/fix triage
      • Virtualization platforms (e.g., VMware vSphere/ESXi and/or Hyper-V) including cluster health, resource contention, HA/DRS behavior, datastore and VM performance issues
      • Storage systems (SAN/NAS/Storage arrays) including capacity, performance, multipathing, snapshots, and connectivity considerations
    • Execute and improve operational practices
      • Patch management coordination (host/guest) with clear risk controls and validation steps
      • Capacity and performance monitoring; identify bottlenecks and single points of failure
      • Standardize configurations and baselines to reduce incidents
  • Backup, Recovery, and Resilience
    • Administer and support Veeam and Rubrik environments, including:
      • Job health, repository/cluster capacity, retention policies, and alerting
      • Restore operations (file, application, VM, and full environment recovery) under time pressure
      • Recovery testing, validation, and documentation (runbooks, RTO/RPO alignment)
    • Improve recoverability posture (immutability where applicable, ransomware recovery workflows, least-privilege access)
    • Proactively identify backup gaps and propose improvements to reduce exposure and improve recovery outcomes.
  • Hybrid Cloud & Identity
    • Administer and troubleshoot Azure core infrastructure services:
      • Networking/connectivity (VNets, routing, firewalling patterns), compute, storage, and governance (policy, tagging/standards)
      • Monitoring and alerting via Azure Monitor / Log Analytics (or equivalent) with actionable signal quality
    • Manage and troubleshoot Microsoft Entra ID (Azure AD)
      • Identity lifecycle basics, access administration, RBAC, privileged access patterns (e.g., PIM)
      • Conditional Access design/troubleshooting and secure authentication policies (MFA, identity protection concepts)
    • Support hybrid identity dependencies as relevant (e.g., AD integration, directory sync/identity flows) without turning the role into “pure Microsoft admin.”
  • Incident Ownership, Documentation, and Continuous Improvement
    • Lead escalated ticket troubleshooting using Aqueduct processes and best practices.
    • Maintain exceptional notes, clear reasoning, and step-by-step detail in all work.
    • Review customer environments proactively to identify risks, misconfigurations, or single points of failure.
    • Keep documentation updated, accurate, and thorough for all identity, access, and cloud services.
    • Identify recurring issues and propose improvements that reduce future incidents (automation, standards, documentation, monitoring).
  • Change Control
    • Fully participate in and comply with Aqueduct’s Change Control processes.
    • Plan and execute infrastructure-impacting changes carefully, following risk-mitigation and rollback best practices.
  • Training & Growth
    • Maintain and pursue deep technical specialization across Azure, Entra, Veeam, and Rubrik.
    • Engage in cross-training to expand into additional NOC practice areas.
    • Act as a mentor and knowledge resource to junior NOC engineers.