ARIADNE: Adaptive UVM Management for Efficient GPU Memory Oversubscription
This program is tentative and subject to change.
Unified Virtual Memory (UVM) simplifies GPU programming and supports memory oversubscription, but suffers from severe performance degradation under high memory pressure due to page fault overhead and thrashing. Existing approaches such as prefetching, access-counter-based migration, and dynamic Zero-copy offer limited benefits and often require hardware or compiler modifications, undermining UVM’s portability and ease of deployment. We present ARIADNE, a runtime UVM management framework that preserves UVM’s GPU memory abstraction while ensuring high and robust performance under memory oversubscription. ARIADNE is guided by three principles: (1) pipelined fault handling to hide migration latency, (2) Sharing Degree, a runtime metric that captures thread-level access locality without requiring hardware or compiler changes, to inform placement decisions, and (3) dynamic placement of memory regions between GPU memory and Zero-copy based on real-time access patterns. Implemented entirely within NVIDIA’s open-source UVM driver, ARIADNE requires no recompilation or hardware modifications and applies transparently to any GPU UVM application, including closed-source executables. Our experimental results show that ARIADNE delivers average speedups of 1.9×, 5.0×, and 4.8× over a state-of-the-art method at 130%, 175%, and 300% oversubscription, respectively, while effectively preventing thrashing and maintaining near-linear performance scaling.
Wed 4 Feb (displayed time zone: Hobart)
11:30 - 12:50
Time | Duration | Type | Title | Track | Authors
11:30 | 20m | Talk | Exploration of LLM Workload Reliability based on di/dt effects and Voltage Droops | Main Conference | Zhixing Jiang (University of Texas at Austin); Justin Garrigus (University of Texas at Austin); Allison Seigler (University of Texas at Austin); Ethan Syed (University of Texas at Austin); Yan-Lun Huang (University of Texas at Austin); Mehdi Sadi (Advanced Micro Devices); Tawfik Rahal-Arabi (Advanced Micro Devices); Lizy John (University of Texas, Austin)
11:50 | 20m | Talk | ARIADNE: Adaptive UVM Management for Efficient GPU Memory Oversubscription | Main Conference | Hyunkyun Shin (Yonsei University); Seongtae Bang (DGIST); Hyungwon Park (DGIST); Daehoon Kim (Yonsei University)
12:10 | 20m | Talk | LRM-GPU: Alleviating Synchronization Overhead for Multi-Chiplet GPU Architecture | Main Conference | Baiqing Zhong (Sun Yat-Sen University); Zhirong Ye (Sun Yat-Sen University); Xiaojie Li (Sun Yat-Sen University); Peilin Wang (Sun Yat-Sen University); Haiqiu Huang (Sun Yat-Sen University); Zhaolin Li (Tsinghua University); Zhiyi Yu (Sun Yat-sen University); Mingyu Wang (Sun Yat-Sen University)
12:30 | 20m | Talk | LEGO: Supporting LLM-enhanced Games with One Gaming GPU | Main Conference | Han Zhao (Shanghai Jiao Tong University); Weihao Cui (Shanghai Jiao Tong University); Zeshen Zhang (Tongji University); Wenhao Zhang (Shanghai Jiao Tong University); Jiangtong Li (Tongji University); Quan Chen (Shanghai Jiao Tong University, China); Youmin Chen (Shanghai Jiao Tong University); Pu Pang (Shanghai Jiao Tong University); Zijun Li (Shanghai Jiao Tong University); Zhenhua Han (The University of Hong Kong); Yuqing Yang (Microsoft Research); Minyi Guo (Shanghai Jiao Tong University)