API Interception-Based GPU Virtualization for Containerized HPC Workloads in Cloud Environments
Abstract
In recent years, with the rise of cloud-native technologies such as containers and Kubernetes, GPU-based high-performance computing (HPC) tasks have gradually migrated to container cloud environments, introducing new challenges in fine-grained GPU resource management. Native Kubernetes cannot allocate fractional GPU resources to containers; it permits only exclusive access to entire physical GPUs, which results in low cluster-wide GPU utilization. Efficient GPU sharing for HPC tasks in containerized environments therefore requires GPU virtualization, that is, allocating precise amounts of GPU compute and memory resources to different containers with isolation guarantees. However, current GPU virtualization technologies remain in their infancy and exhibit the following limitations: (1) NVIDIA GPUs are closed-source at the driver layer and below, forcing existing solutions to rely on reverse engineering for virtualization; (2) current implementations fail to address GPU idle cycles during HPC task execution, wasting computational resources. To address these issues, this paper proposes a GPU virtualization system for container cloud environments. The key contributions are: (a) profiling the workflow and invocation mechanisms of CUDA-based HPC tasks (e.g., deep learning training) to characterize their GPU memory and compute usage patterns; (b) developing formal models of HPC resource utilization; and (c) implementing a resource isolation and quota mechanism via API interception and forwarding. Experimental results demonstrate that our system achieves superior virtualization efficiency with lower overhead. Under heavy loads, the proposed adaptive elastic GPU allocation method yields 37% higher average GPU utilization and 26% greater cluster throughput than static allocation in KubeShare.
DOI: https://doi.org/10.31449/inf.v50i12.12099
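
The quota mechanism in contribution (c) can be illustrated with a minimal, pure-Python sketch. It is not the paper's implementation: `FakeDriver`, `InterceptedDriver`, and `QuotaExceeded` are hypothetical names introduced here, and the sketch only simulates the idea of checking each memory allocation against a per-container quota before forwarding it to the underlying driver. A real system would presumably hook CUDA calls such as `cuMemAlloc` rather than Python methods; that mapping is an assumption.

```python
class QuotaExceeded(MemoryError):
    """Raised when a container requests more GPU memory than its quota."""


class FakeDriver:
    """Hypothetical stand-in for the GPU driver; hands out integer handles."""

    def __init__(self):
        self._next_handle = 1

    def mem_alloc(self, nbytes):
        handle = self._next_handle
        self._next_handle += 1
        return handle

    def mem_free(self, handle):
        pass  # nothing to release in this simulation


class InterceptedDriver:
    """Checks each allocation against a quota, then forwards it to the driver."""

    def __init__(self, driver, quota_bytes):
        self._driver = driver
        self._quota = quota_bytes
        self._sizes = {}  # handle -> size of that allocation in bytes
        self.used = 0     # bytes currently charged to this container

    def mem_alloc(self, nbytes):
        # Reject the request before it ever reaches the driver.
        if self.used + nbytes > self._quota:
            raise QuotaExceeded(
                f"requested {nbytes} B, {self._quota - self.used} B left in quota"
            )
        handle = self._driver.mem_alloc(nbytes)  # forward the call
        self._sizes[handle] = nbytes
        self.used += nbytes
        return handle

    def mem_free(self, handle):
        nbytes = self._sizes.pop(handle)
        self._driver.mem_free(handle)  # forward the call
        self.used -= nbytes


if __name__ == "__main__":
    gpu = InterceptedDriver(FakeDriver(), quota_bytes=1024)
    first = gpu.mem_alloc(600)
    try:
        gpu.mem_alloc(500)  # would bring usage to 1100 B, over the 1024 B quota
    except QuotaExceeded as err:
        print("rejected:", err)
    gpu.mem_free(first)
    print("bytes in use after free:", gpu.used)
```

In this sketch the workload sees the same `mem_alloc` and `mem_free` interface as the driver, so the quota check stays transparent to the caller. That transparency is the property an interception layer relies on.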
License
Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.
All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.
Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.







