Microsoft · posted 3 days ago
$100,600 - $199,000/Yr
Mid Level
Redmond, WA
Publishing Industries

At Microsoft, we're a community of passionate innovators driven by curiosity and purpose. We collaborate to imagine what's possible and accelerate our careers in a cloud-powered world where openness and innovation unlock limitless potential. Artificial Intelligence is central to Microsoft's strategy, and the Azure AI Platform is leading the charge. As part of our team, you'll contribute to cutting-edge projects that solve real-world challenges using transformative technologies.

We are looking for a Software Engineer 2 - Core AI to join our agile team at the core of Microsoft's AI infrastructure. This team is building the Next Gen Scheduling & Optimization Platform, a foundational infrastructure layer that powers OpenAI models and other large-scale AI workloads across Azure.

In this role, you will be responsible for managing the inferencing capacity that fuels Microsoft's AI ambitions. Our fleet of premium AI accelerators runs state-of-the-art OpenAI models, forming the backbone of Microsoft's Copilots and the Azure OpenAI Service. You'll help dynamically allocate resources across models and customer offerings, monitor usage in near real time, and rebalance capacity to drive massive efficiency gains. You'll work on high-impact distributed systems that support low-latency, high-volume, mission-critical customer scenarios, solving complex challenges in resource orchestration, telemetry, and performance optimization. You'll collaborate across Azure, OpenAI, CoreAI Services, and infrastructure teams to shape the future of scalable, cost-efficient AI.

  • Design and implement scalable services for GPU scheduling, allocation, and optimization across diverse AI workloads.
  • Build reliable orchestrations to monitor GPU usage in near real time and drive automated rebalancing decisions.
  • Integrate with fleet health dashboards and GPU lifecycle management systems to ensure reliability and performance.
  • Collaborate with partner teams across Azure ML, AOAI, and Core AI to align architecture, APIs, and operational readiness.
  • Contribute to platform evolution supporting new hardware and real-time inference APIs.

  • Bachelor's Degree in Computer Science or a related technical field AND 2+ years of technical engineering experience coding in languages including, but not limited to, C#, Go, or Python.
  • 2+ years of experience with distributed systems or cloud infrastructure.
  • 2+ years of experience with telemetry, metrics pipelines, or resource scheduling systems.
  • Familiarity with cloud platforms (Azure, AWS, GCP) and container orchestration (Kubernetes, Service Fabric).
  • Exposure to GPU-based workloads, model serving, or AI infrastructure.
  • Experience working with real-time systems or high-throughput APIs.

  • Base pay range for this role across the U.S. is USD $100,600 - $199,000 per year.
  • Base pay range for this role in the San Francisco Bay area and New York City metropolitan area is USD $131,400 - $215,400 per year.