Storage Systems Administrator II

Crusoe
San Francisco, CA
10d
$128,500 - $151,000

About The Position

At Crusoe, we are on a mission to align the future of computing with the future of the climate. As a Storage Systems Administrator II, you will be a key operator of the high-performance data layer for our vertically integrated AI cloud. This role focuses on the reliability and maintenance of our all-flash storage ecosystems—specifically VAST Data and Pure Storage—ensuring they deliver the high-speed data access required for massive-scale AI training. Working as part of the Storage Team, you will manage the daily health of our global storage footprint. Your work ensures that our sustainable GPU clusters have a stable, high-throughput data backbone, directly supporting the world's leading AI researchers.

Requirements

  • Technical Experience: 2–6 years of experience in Storage or Systems Administration, with a solid foundation in managing enterprise-grade storage arrays.
  • Hands-on Flash Experience: Direct experience with VAST Data or Pure Storage (FlashBlade/FlashArray) is highly preferred.
  • Linux Fundamentals: Strong proficiency with the Linux CLI, including a clear understanding of mounting file systems and basic network configuration.
  • Protocol Knowledge: Familiarity with high-performance protocols such as NFS (including NFS over RDMA), SMB, or NVMe-oF.
  • Scripting Ability: Ability to use Python or Bash to interact with APIs or automate repetitive system tasks.
  • Execution & Care: A detail-oriented approach to documentation and change management, ensuring petabyte-scale environments remain stable.

Nice To Haves

  • Experience using Pure1 or VAST VMS/Insight for monitoring and capacity planning.
  • Basic understanding of InfiniBand and RoCE networking.
  • Experience in a data center environment or with high-performance computing (HPC) workloads.

Responsibilities

  • Storage Operations: Manage the daily administration of VAST Data and Pure Storage environments, including volume provisioning, export management, and quota adjustments.
  • Health & Monitoring: Use tools like Grafana and Prometheus to monitor cluster health, tracking IOPS and latency to identify potential bottlenecks before they impact users.
  • Maintenance & Upgrades: Assist in executing non-disruptive software upgrades (VAST OS, Purity//FB) and hardware expansions to keep our infrastructure modern and secure.
  • Data Integrity: Implement and verify snapshot schedules and replication policies to ensure data durability and successful recovery points.
  • Troubleshooting: Resolve storage-related tickets and performance issues, collaborating with senior engineers and vendor support (VAST/Pure) to minimize downtime.
  • Task Automation: Write and maintain scripts (Python/Bash) to automate routine administrative tasks, such as reporting on capacity and streamlining user access.

Benefits

  • Competitive compensation
  • Restricted Stock Units
  • Paid time off & paid holidays
  • Comprehensive health, dental & vision insurance
  • Employer contributions to HSA
  • Paid parental leave
  • Paid life insurance, short-term and long-term disability
  • Professional development & tuition reimbursement
  • Mental health & wellness support
  • Commuter benefits (parking & transit)
  • Cell phone stipend
  • 401(k) retirement plan with company match up to 4% of salary
  • Volunteer time off

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Education Level

No Education Listed

Number of Employees

251-500 employees
