Multi-GPU compute nodes are the backbone of exascale supercomputers and AI factories. As a leading programming model for node-level parallelism, OpenMP must evolve to harness these complex architectures. Recent advancements in OpenMP 6.0, alongside ongoing efforts toward Python integration, offer novel optimization opportunities that remain largely unexplored in production compilers and runtimes. This project aims to investigate these opportunities by leveraging the recent transparent clause and taskgraph constructs within the LLVM ecosystem. The research will drive impactful performance enhancements in the LLVM OpenMP runtime and compiler, specifically through GPU command batching and fusion. While these techniques are industry standards in AI/ML frameworks such as TensorFlow and PyTorch, they remain underutilized in traditional scientific simulation environments. Bridging this gap will help ensure OpenMP remains a performant, portable, and reliable choice for the next generation of computational science.

Education and Experience Requirements

The entirety of the appointment must be conducted within the United States.

Applicants must be:
o Currently enrolled in undergraduate or graduate studies at an accredited institution;
o Graduated from an accredited institution within the past 3 months; or
o Actively enrolled in a graduate program at an accredited institution.

Additional requirements:
o Must be 18 years or older at the time the appointment begins.
o Must possess a cumulative GPA of 3.0 on a 4.0 scale.
o Candidates who accept an offer may be required to complete pre-employment drug testing, depending on appointment length; all students remain subject to applicable drug testing policies.
o Must complete a satisfactory background check.
Job Type
Full-time
Career Level
Intern
Education Level
No Education Listed
Number of Employees
1,001-5,000 employees