Data Center IT Manager (Tulsa, Oklahoma)

Nebius, Tulsa, OK
Onsite

About The Position

Nebius is leading a new era in cloud computing to serve the global AI economy. We create the tools and resources our customers need to solve real-world challenges and transform industries, without massive infrastructure costs or the need to build large in-house AI/ML teams. Our employees work at the cutting edge of AI cloud infrastructure alongside some of the most experienced and innovative leaders and engineers in the field.

Where we work

Headquartered in Amsterdam and listed on Nasdaq, Nebius has a global footprint with R&D hubs across Europe, North America, and Israel. The team of over 800 employees includes more than 400 highly skilled engineers with deep expertise across hardware and software engineering, as well as an in-house AI R&D team.

The Team

You will join a fast-growing Neocloud infrastructure organization operating at the intersection of AI, high-performance computing (HPC), and hyperscale cloud services. Our data centers host large-scale GPU clusters purpose-built for AI training, inference, and accelerated workloads. As global data center operations expand, this role offers significant career progression, direct involvement in new data center builds and expansions, and exposure to industry-leading GPU platforms. Your work will have a measurable impact on platform reliability, customer SLAs, and operational efficiency, while collaborating closely with experts in AI infrastructure design, cloud operations, and data center engineering. This is a highly collaborative, innovation-driven environment where best practices are continuously refined to exceed industry standards in design, deployment, and operations.

The Role

As Data Center IT Manager, you will be responsible for the end-to-end operation, planning, and continuous improvement of IT infrastructure within our Neocloud data centers.

You will lead and develop a high-performing IT support team responsible for GPU-dense cloud clusters, including advanced platforms such as NVIDIA H200-based systems. The role blends hands-on technical leadership with strategic planning, project delivery, compliance oversight, and vendor management. You will work closely with internal stakeholders (Cloud Engineering, Network, Security, Legal, Procurement) and external partners (OEMs, colocation providers, integrators) to ensure infrastructure is scalable, compliant, cost-effective, and operationally excellent. On-site work at colocation facilities in Tulsa, Oklahoma, is supported and encouraged as part of hands-on operational leadership.

Requirements

  • 5+ years of experience leading technical teams in data center or cloud infrastructure environments.
  • Strong hands-on expertise with server hardware, including GPU-accelerated systems and high-density deployments.
  • Solid understanding of data center operations, networking fundamentals, and enterprise infrastructure design.
  • Proven experience in project planning, execution, and delivery within complex technical environments.
  • Working knowledge of ITIL/ITSM frameworks and operational service management.
  • Proficiency with Linux systems and command-line troubleshooting.
  • Experience with enterprise network switches, fiber optics, and structured cabling.
  • Strong analytical skills, including advanced Excel proficiency (pivot tables, formulas, reporting, and visualization).
  • Proactive, ownership-driven mindset with strong organizational and leadership skills.

Nice To Haves

  • ITIL, PMI, or equivalent project management certifications
  • Experience working in Neocloud, hyperscale, or AI/HPC environments
  • Familiarity with GPU lifecycle management, firmware, and platform validation
  • Exposure to contract reviews, vendor negotiations, or procurement processes
  • Experience supporting compliance audits or regulatory frameworks (e.g., ISO, SOC, customer audits)

Responsibilities

  • Own the availability, performance, and lifecycle management of data center IT infrastructure, with a strong emphasis on GPU-based compute platforms.
  • Lead troubleshooting and root-cause analysis for GPU servers, high-density racks, networking, and storage, ensuring minimal customer impact.
  • Ensure operational readiness for large-scale AI workloads, including thermal, power, and cabling considerations specific to GPU clusters.
  • Oversee hardware installation, upgrades, decommissioning, and migrations aligned with cloud growth and customer demand.
  • Lead capacity planning for GPU, CPU, networking, and rack space based on growth forecasts and customer commitments.
  • Manage cross-functional infrastructure projects, including new deployments, expansions, and technology refresh cycles.
  • Define project scopes, timelines, risks, and dependencies, ensuring delivery on time and within budget.
  • Drive continuous improvement through post-incident reviews and project retrospectives.
  • Manage and mentor the IT support team to meet or exceed KPIs, SLAs, and operational objectives.
  • Establish and optimize IT operational processes aligned with ITIL/ITSM best practices.
  • Develop and maintain high-quality technical documentation, runbooks, and operational procedures.
  • Ensure infrastructure operations comply with data center regulations, security standards, customer contractual requirements, and data sovereignty obligations.
  • Partner with Legal, Security, and Compliance teams on audits, certifications, and regulatory requirements.
  • Support incident response and investigations with accurate technical documentation and evidence.
  • Ensure asset management, chain-of-custody, and decommissioning processes meet compliance and audit standards.
  • Act as a key technical stakeholder in vendor selection, evaluations, and negotiations with OEMs, hardware vendors, colocation providers, and service partners.
  • Manage RMAs, warranty processes, and vendor escalations, ensuring timely resolution.
  • Contribute to commercial discussions, cost optimization initiatives, and long-term vendor strategies.
  • Review technical aspects of contracts, SLAs, and statements of work to ensure operational feasibility and risk mitigation.

Benefits

  • Competitive salary and comprehensive benefits package.
  • Opportunities for professional growth within Nebius.
  • Flexible working arrangements.
  • A dynamic and collaborative work environment that values initiative and innovation.