Research Aide – LCF – Moghadampanah, Mona – 2.2.26

Argonne National Laboratory, Lemont, IL
$31 - $47

About The Position

Multi-Modal Large Language Models (MLLMs) extend the capabilities of LLMs by incorporating additional modalities, such as images, video, and audio, enabling advanced capabilities including perception-grounded reasoning, visual question answering (VQA), captioning, and scene understanding. Unlike pure-text LLMs, MLLMs introduce an additional stage, the visual encoding stage, which transforms multimodal inputs into embeddings consumed by the language model’s prefill and decoding stages. This extra stage, together with the heterogeneous behavior of the stages downstream of it, introduces energy and performance inefficiencies that are not yet well understood.

This project aims to gain a deeper understanding of these inefficiencies and to analyze the energy and performance characteristics of MLLM inference. We plan to evaluate four state-of-the-art MLLMs (InternVL3-8B, LLaVA-1.5-7B, LLaVA-OneVision-7B, and Qwen2.5-VL-7B) on two controlled multi-GPU systems, Aurora and Polaris, and to propose a system-level performance-energy tradeoff model that explicitly accounts for the heterogeneous behavior of the different inference stages.

The key objectives of this work include:

  • Characterizing the energy and performance bottlenecks of MLLM inference pipelines.
  • Analyzing the energy and performance impact of different input modalities and modality-specific features (e.g., images, video, and audio).
  • Designing workload-aware power management strategies that employ system-level power control mechanisms, such as dynamic voltage and frequency scaling (DVFS) and power capping, to reduce energy consumption while meeting service-level objectives (SLOs).
  • Demonstrating practical energy savings for real-world multimodal inference deployments without compromising latency or throughput requirements.

Requirements

  • The entirety of the appointment must be conducted within the United States.
  • Applicants must be one of the following:
    ‒ Currently enrolled in undergraduate or graduate studies at an accredited institution;
    ‒ Graduated from an accredited institution within the past 3 months; or
    ‒ Actively enrolled in a graduate program at an accredited institution.
  • Must be 18 years or older at the time the appointment begins.
  • Must possess a cumulative GPA of 3.0 on a 4.0 scale.
  • If accepting an offer, must pass a drug screening test.
  • Must complete a satisfactory background check.

Responsibilities

  • Characterizing the energy and performance bottlenecks of MLLM inference pipelines.
  • Analyzing the energy and performance impact of different input modalities and modality-specific features (e.g., images, video, and audio).
  • Designing workload-aware power management strategies that employ system-level power control mechanisms such as dynamic voltage and frequency scaling (DVFS) and power capping to reduce energy consumption while meeting service-level objectives (SLOs).
  • Demonstrating practical energy savings for real-world multimodal inference deployments without compromising latency or throughput requirements.
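To make the idea of a stage-aware performance-energy tradeoff model concrete, here is a minimal, purely illustrative Python sketch. It assumes a first-order DVFS model in which, relative to a reference frequency, per-stage latency scales as 1/f and power scales roughly as f³ (so energy scales as f²), and it greedily lowers each stage's frequency while an end-to-end latency SLO still holds. The stage names, latency/power numbers, and frequency steps are all hypothetical placeholders, not measurements from Aurora or Polaris.

```python
# Illustrative stage-aware performance-energy tradeoff sketch.
# Model assumptions (not measured data):
#   latency(f) = base_latency / f      -- compute-bound approximation
#   power(f)   = base_power * f**3     -- rough cubic DVFS scaling
#   energy(f)  = latency(f) * power(f) = base_latency * base_power * f**2

STAGES = {               # (base_latency_s, base_power_w) at normalized f = 1.0
    "visual_encode": (0.08, 300.0),
    "prefill":       (0.05, 350.0),
    "decode":        (0.40, 220.0),
}

FREQS = [0.6, 0.7, 0.8, 0.9, 1.0]  # candidate normalized frequency settings


def best_freq_per_stage(slo_s: float) -> dict:
    """Pick, per stage, the lowest-energy frequency such that the summed
    end-to-end latency still meets the SLO (greedy descent from max freq)."""
    choice = {s: 1.0 for s in STAGES}

    def total_latency(ch):
        return sum(lat / ch[s] for s, (lat, _) in STAGES.items())

    improved = True
    while improved:
        improved = False
        for s in STAGES:
            idx = FREQS.index(choice[s])
            if idx == 0:
                continue  # already at the lowest setting
            trial = dict(choice)
            trial[s] = FREQS[idx - 1]
            if total_latency(trial) <= slo_s:
                choice = trial
                improved = True
    return choice


def total_energy(choice: dict) -> float:
    """Total energy (J) under the per-stage frequency assignment."""
    return sum(lat * pwr * choice[s] ** 2 for s, (lat, pwr) in STAGES.items())
```

The point of the sketch is that the optimum differs per stage: the long, lower-power decode stage cannot be slowed as far as the short visual-encode and prefill stages before the SLO is violated, which is exactly the heterogeneity a single global power cap would miss.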