The objective of this project is to develop an integrated multi-knob performance-energy trade-off model that controls several knobs together (node power limits, GPU/CPU/Uncore DVFS ...) in order to overcome the limitations of single-knob approaches. The approach is to build lightweight response models that describe how run time and energy vary across knob combinations in different phases of a workload. These models are then used to identify the best configuration under combined performance and facility power constraints. The model accounts for node-to-node variability, which allows a non-uniform system-wide power assignment and thereby improves load balancing and total system throughput. The effectiveness of the developed solution will be assessed with representative HPC and large-scale AI workloads.
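To make the idea of selecting a configuration from a response model concrete, here is a minimal sketch. It is not the project's method: the knob values, the `toy_response` formula, and the function names are all hypothetical and chosen only for illustration. It enumerates two knobs (a node power limit and a GPU frequency), predicts run time and power for each combination, and picks the fastest combination that fits under a given power cap.

```python
from itertools import product

# Hypothetical knob settings; the project description does not specify any.
NODE_POWER_LIMITS_W = [300, 400, 500]
GPU_FREQS_MHZ = [1000, 1400]


def toy_response(power_limit_w, gpu_freq_mhz):
    """Made-up response model returning (runtime_s, avg_power_w).

    A real model would be fitted to measurements per workload phase.
    """
    runtime_s = 1000.0 * (1400 / gpu_freq_mhz) * (500 / power_limit_w) ** 0.5
    avg_power_w = min(power_limit_w, gpu_freq_mhz * 3 / 10)
    return runtime_s, avg_power_w


def best_config(power_cap_w, response=toy_response):
    """Return the fastest knob combination whose predicted power fits the cap.

    Ties in run time are broken by lower predicted energy. Returns None
    when no combination satisfies the cap.
    """
    feasible = []
    for limit, freq in product(NODE_POWER_LIMITS_W, GPU_FREQS_MHZ):
        runtime_s, avg_power_w = response(limit, freq)
        if avg_power_w <= power_cap_w:
            energy_j = runtime_s * avg_power_w
            feasible.append((runtime_s, energy_j, limit, freq))
    if not feasible:
        return None
    runtime_s, energy_j, limit, freq = min(feasible)
    return {
        "power_limit_w": limit,
        "gpu_freq_mhz": freq,
        "runtime_s": runtime_s,
        "energy_j": energy_j,
    }


if __name__ == "__main__":
    for cap in (450, 350, 200):
        print(cap, best_config(cap))
```

Exhaustive enumeration is only practical for a handful of knobs. With more knobs, per-phase models, or per-node variability, the same selection step would need a smarter search or an optimizer, which this sketch does not attempt.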
Job Type
Full-time
Career Level
Intern
Education Level
No Education Listed