Amazon.com
Posted 3 days ago
$151,300 - $261,500/Yr
Full-time • Senior
Bellevue, WA
General Merchandise Retailers

About the position

AWS Infrastructure Services (AIS) owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain - and we're looking for talented people who want to help.

Join the DC Bridge team in pioneering next-generation AI/ML solutions that power AWS's global data center operations. We're building cutting-edge systems that orchestrate physical work processes across AWS's worldwide data centers, directly impacting millions of customers who rely on AWS services. You'll be at the forefront of transforming data center operations through AI/ML innovations, developing intelligent systems that optimize technician workflows, automate decision-making processes, and enhance operational efficiency across AWS's global infrastructure.

Responsibilities

  • Lead and architect the development of state-of-the-art AI/ML platforms and solutions, serving as a technical leader for both data center operations and engineering teams
  • Own end-to-end delivery of technically challenging projects, including scalable ML frameworks, deployment pipelines, and intuitive interfaces for non-ML experts
  • Drive operational excellence by building robust data processing pipelines and ETL systems for DC telemetry data, while identifying and addressing operational challenges early
  • Mentor and grow junior engineers, acting as a force multiplier by sharing AI/ML expertise and best practices
  • Lead cross-functional collaboration efforts to integrate ML solutions into existing DC workflows, while maintaining the highest standards for system extensibility and scalability
  • Contribute to and champion improvements in development processes, particularly in the context of ML development and deployment
  • Provide technical consultation and architectural guidance to internal customers while insisting on the highest standards for long-term system sustainability
  • Design and implement reusable components and tools that enhance team productivity and system reliability

Requirements

  • 8+ years of software development experience with proven expertise in Python, Java, or equivalent languages
  • Strong background in one or more of: Frontend development and UI/UX design, Platform engineering (SDKs/Frameworks), ETL and large-scale data processing, DC telemetry systems, Machine learning (specializing in anomaly detection, classification, time series analysis), Solution architecture and technical advisory
  • Hands-on experience with modern ML frameworks (TensorFlow, PyTorch, SageMaker, Bedrock)
  • Track record of mentoring and technical leadership
  • Excellence in problem-solving and communication

Nice-to-haves

  • Thrives in ambiguous environments and adapts quickly to change
  • Demonstrates a scrappy mindset with ability to deliver results in fast-paced settings
  • Maintains deep technical expertise while staying customer-focused
  • Shows passionate engagement with AI/ML advancements
  • Possesses strong understanding of AI/ML technology application (LLMs, agents, RAG, ML models)
  • Works autonomously and demonstrates deep problem-solving capabilities
  • Balances incremental improvements with disruptive innovation when needed

Benefits

  • 401(k)
  • Health insurance
  • Dental insurance
  • Vision insurance
  • Life insurance
  • Disability insurance
  • Paid holidays
  • Paid volunteer time
  • Tuition reimbursement
  • Professional development
  • Flexible scheduling
  • Employee stock purchase plan