System Software Engineer (Embedded)

Cerebras Systems, Sunnyvale, CA

About The Position

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.

Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

The Role

As part of the Embedded Software team, you will help build the critical software foundation that powers the Cerebras Wafer Scale Engine (WSE), the world’s largest AI processor. Our team owns a diverse range of embedded and system-level components that enable the WSE to operate reliably at scale, including microcontroller firmware, wafer-level monitoring logic, system administration services, and the Linux platform and BSP layers that keep the entire system running smoothly. This role exists at the intersection of embedded systems, platform engineering, and distributed system enablement. As our technology and deployments continue to scale, we are expanding the team with versatile engineers eager to work across multiple layers of the software stack.
You will help build administrative services that connect the WSE’s system software to cluster-level orchestration, collaborate closely with hardware and ASIC teams, and contribute to the robustness, visibility, and operability of our next-generation AI systems.

Requirements

  • Bachelor’s degree in Computer Engineering, Electrical Engineering, Computer Science, or a related field.
  • 5+ years of experience building production-quality software in C++ or Golang.
  • Solid understanding of embedded systems fundamentals or system hardware interactions.
  • Experience working in cross-functional engineering environments.

Nice To Haves

  • Master’s degree in Computer Engineering, Electrical Engineering, Computer Science, or a related field.
  • Exposure to distributed systems, cluster-level orchestration, or datacenter environments.
  • Familiarity with Linux kernel concepts, device drivers, or BSP layers.
  • Experience debugging hardware/software interactions using tools such as logic analyzers, JTAG, or profiling/tracing frameworks.
  • Experience contributing to system monitoring, observability tooling, or hardware-level telemetry pipelines.

Responsibilities

  • Develop administrative software that enables communication between system-level software and cluster-level control layers.
  • Provide and extend Linux BSP support, ensuring reliability and maintainability of system-level platform components.
  • Collaborate across teams to gather requirements, define scope, plan milestones, and deliver high-quality implementations.
  • Work closely with datacenter operations and debug teams to diagnose system-level issues, root-cause failures, and implement fixes.
  • Partner with hardware and ASIC teams to design and implement software that monitors system hardware and wafer-level behavior.
  • Contribute to improving system reliability, observability, and long-term maintainability across layers of the embedded stack.
  • Participate in code reviews, design discussions, and cross-team technical planning.