Join our team building the scale-out networking backbone that powers the world's largest AI training clusters. We develop high-performance RDMA and RoCE solutions that enable distributed training of trillion-parameter models across thousands of compute nodes on AWS infrastructure. Our team builds the networking software that connects massive AI accelerator clusters, with a focus on SmartNIC integration, collective communication optimization, and ultra-high-bandwidth inter-rack connectivity. As a senior engineer, you'll drive technical architecture decisions and lead the development of next-generation distributed AI training infrastructure.