Senior DevOps Engineer

Astera Labs•San Jose, CA

About The Position

Astera Labs (NASDAQ: ALAB) provides rack-scale AI infrastructure through purpose-built connectivity solutions. By collaborating with hyperscalers and ecosystem partners, Astera Labs enables organizations to unlock the full potential of modern AI. Astera Labs’ Intelligent Connectivity Platform integrates CXL®, Ethernet, NVLink, PCIe®, and UALink™ semiconductor-based technologies with the company’s COSMOS software suite to unify diverse components into cohesive, flexible systems that deliver end-to-end scale-up and scale-out connectivity. The company’s custom connectivity solutions business complements its standards-based portfolio, enabling customers to deploy tailored architectures to meet their unique infrastructure requirements. Discover more at www.asteralabs.com.

We are seeking a skilled Senior DevOps Engineer to join our Silicon Engineering Infrastructure team. In this role, you will build, maintain, and optimize the cloud-based infrastructure that supports our semiconductor design and verification workflows. You will work closely with silicon engineering teams to ensure reliable, scalable, and efficient compute environments.

Requirements

3+ years of hands-on DevOps/Infrastructure engineering experience
Strong problem-solving skills with the ability to debug complex system issues
Solid operational knowledge of AWS Cloud services, including:
EC2 (instance management, AMIs, spot/on-demand strategies)
FSx (Lustre/NetApp ONTAP for high-performance storage)
VPC, Security Groups, IAM, and networking fundamentals
Experience with scripting languages such as Python, Bash, or similar
Familiarity with Infrastructure-as-Code tools (Terraform, CloudFormation, or Ansible)
Experience with CI/CD pipelines and version control systems (Git)
A proactive, self-motivated approach to identifying and solving infrastructure challenges
Strong communication skills to collaborate with cross-functional engineering teams
Ability to work in a fast-paced environment with competing priorities
Passion for automation and continuous improvement

Nice To Haves

Experience with AWS ParallelCluster or similar HPC cluster management tools
Background in the Semiconductor/EDA industry with understanding of:
EDA tool workflows (simulation, synthesis, place & route, verification)
License management and job scheduling (LSF, Slurm, SGE)
Debug scenarios specific to silicon design environments
Knowledge of container technologies (Docker, Singularity)
Experience with monitoring and observability tools (CloudWatch, Prometheus, Grafana)
AWS certifications (Solutions Architect, SysOps Administrator) are a plus
Experience with hybrid cloud architectures (on-prem + cloud)
Familiarity with cost optimization strategies for large-scale cloud deployments
Understanding of security best practices in regulated environments
Experience with AI-based tools such as Claude Code or Copilot is a plus

Responsibilities

Design, deploy, and maintain cloud infrastructure on AWS to support silicon engineering workloads
Manage and optimize EC2 instances, FSx file systems, and related AWS services for high-performance computing needs
Implement and manage AWS ParallelCluster for provisioning and scaling compute clusters and partitions
Troubleshoot and resolve complex infrastructure issues across cloud and on-premises environments
Develop automation scripts and Infrastructure-as-Code (IaC) solutions to streamline operations
Collaborate with EDA tool administrators and silicon engineers to optimize workflows and resource utilization
Monitor system performance, implement alerting, and ensure high availability of critical infrastructure
Document processes, runbooks, and best practices for team knowledge sharing
