Senior GenAI & High Performance Computing (HPC) Delivery Engineer

Dell Technologies•Austin, TX

1d•Onsite

About The Position

Senior GenAI & High Performance Computing (HPC) Delivery Engineer Dell Technologies has delivered HPC solutions for 25+ years, including support for Bright Cluster Manager (now NVIDIA BCM) since 2011. Today, Dell is NVIDIA’s preferred partner for GenAI Factory systems, using Dell GenAI PowerEdge XE servers and NVIDIA NVAIE to help customers build and scale end to end GenAI and High-Performance Computing environments. Join us to do the best work of your career and make a profound social impact as a Senior GenAI & High Performance Computing (HPC) Delivery Engineer on our Service Delivery Team in Austin, Texas. What you’ll achieve We’re seeking a Senior GenAI & HPC Engineer with deep experience in GPU accelerated systems, Linux performance tuning, and benchmarking. This role is highly hands on and customer facing, supporting onsite deployments across the U.S. for advanced HPC and GenAI solutions. You will work as a part of a team to help build, integrate, and test some of the world’s largest multi GPU systems, benchmark them using industry standard tools, and deliver the next generations of AI and HPC infrastructure.

Requirements

7+ years with HPC or GenAI clusters, GPU based systems, AI infrastructure, or related fields
Deep hands on experience with GPU deployment, configuration, and multi-node testing using NVIDIA Base Command Manager
Proficiency with benchmarking tools: HPL, STREAM, NCCL, RCCL, MxP, OSU Microbenchmarks
Red Hat certification (RHCSA/RHCE) or 7+ years of relevant RH distros experience
Experience with GenAI/HPC networking (InfiniBand and/or RoCE)
Experience working in Linux based parallel computing environments at scale
Experience with containers/orchestration (Docker, Singularity/Apptainer, Kubernetes, Slurm)
Ability to travel up to 70% of the time across the U.S. as needed for projects
Strong customer facing and communication skills

Nice To Haves

Bachelor’s degree
NVIDIA certifications (NCA, NCE, DGX)
Experience with NVIDIA UFM, Infiniband, and SpectrumX fabrics
Exposure to hybrid cloud or GPU cloud environments
Experience with GPU observability/performance profiling tools

Responsibilities

Deploy, configure, and validate GPU accelerated compute clusters for AI, ML, and HPC with NVIDIA Base Command Manager (Warewulf and OpenHPC knowledge are a plus)
Perform benchmarking with HPL GPU, HPL MxP, STREAM, NCCL, RCCL, OSU Microbenchmarks, and related tools
Produce as-built documentation, performance reports, and share best practices amongst the team.
Configure and secure RHEL, Ubuntu, Rocky for GenAI or HPC workloads
Work directly with customers onsite (travel both regionally and across the U.S.)

Benefits

Dell is committed to fair and equitable compensation practices.
The salary range for this position is $153,850 to $199,100.
Your life. Your health. Supported by your benefits.
You can explore the overall benefits experience that awaits you as a Dell Technologies team member — right now at MyWellatDell.com

Stand Out From the Crowd

Upload your resume and get instant feedback on how well it matches this job.

Upload and Match Resume