Systems Specialist

Squarepoint CapitalNew York, NY
5d$200,000 - $250,000

About The Position

Squarepoint Services US LLC seeks a Systems Specialist for its New York, New York location. Duties: Manage the companies’ HPC Slurm clusters, scheduling and configuration. Deploy Prometheus and Grafana Metrics for real-time monitoring. Manage Python/Pandas and kdb/q data analysis for reporting to stakeholders and management efficiency of code, use of hardware. Deploy bespoke plugins in C to improve features for users. Deal with support issues within user code and optimize workloads for hardware. Optimize Nvidia GPU configuration for multi-node machine learning models.

Requirements

  • Must have a minimum of a Master’s degree or foreign equivalent in any STEM (Science, Technology, Engineering, or Math) field of study and 2 years of experience as a HPC Engineer, Systems Engineer/Specialist, Site Reliability Engineer, or related position for an investment/asset management organization.
  • Will also accept a Bachelor’s degree or foreign equivalent in the above disciplines and 5 years of progressive, post-baccalaureate experience as stated above.
  • Must have at least two (2) years of employment experience with each of the following required skills: Deploying Slurm Clusters and optimizing scheduling of high throughput workloads; Debugging system problems and optimizing kernels in Linux; Data analysis in Python – Pandas; Working with Nvidia GPUs, optimizing workloads for hardware and networking stacks; Using KDB/Q for data analysis and debugging user workflows; Live monitoring of clusters using Prometheus and Grafana.

Responsibilities

  • Manage the companies’ HPC Slurm clusters, scheduling and configuration.
  • Deploy Prometheus and Grafana Metrics for real-time monitoring.
  • Manage Python/Pandas and kdb/q data analysis for reporting to stakeholders and management efficiency of code, use of hardware.
  • Deploy bespoke plugins in C to improve features for users.
  • Deal with support issues within user code and optimize workloads for hardware.
  • Optimize Nvidia GPU configuration for multi-node machine learning models.
© 2024 Teal Labs, Inc
Privacy PolicyTerms of Service