ARIADNE: Adaptive UVM Management for Efficient GPU Memory Oversubscription (HPCA 2026 - Main Conference)

Who

Hyunkyun Shin, Seongtae Bang, Hyungwon Park, Daehoon Kim

Track

HPCA 2026 Main Conference

This program is tentative and subject to change.

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

When

Wed 4 Feb 2026 11:50 - 12:10 at Cronulla - GPU Memory Management and Multi-Chiplet Systems

Abstract

Unified Virtual Memory (UVM) simplifies GPU programming and supports memory oversubscription, but it suffers severe performance degradation under high memory pressure due to page-fault overhead and thrashing. Existing approaches such as prefetching, access-counter-based migration, and dynamic zero-copy offer limited benefits and often require hardware or compiler modifications, undermining UVM’s portability and ease of deployment. We present ARIADNE, a runtime UVM management framework that preserves UVM’s GPU memory abstraction while ensuring high and robust performance under memory oversubscription. ARIADNE is guided by three principles: (1) pipelined fault handling to hide migration latency; (2) Sharing Degree, a runtime metric that captures thread-level access locality without requiring hardware or compiler changes, to inform placement decisions; and (3) dynamic placement of memory regions between GPU memory and zero-copy based on real-time access patterns. Implemented entirely within NVIDIA’s open-source UVM driver, ARIADNE requires no recompilation or hardware modifications and applies transparently to any GPU UVM executable, including closed-source applications. Our experimental results show that ARIADNE delivers average speedups of 1.9×, 5.0×, and 4.8× over a state-of-the-art method at 130%, 175%, and 300% oversubscription, respectively, while effectively preventing thrashing and maintaining near-linear performance scaling.
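
To make the thrashing problem in the abstract concrete, the following is a minimal toy sketch, not ARIADNE's algorithm or the NVIDIA driver's eviction policy. It assumes that "130% oversubscription" means a working set 1.3 times the GPU memory capacity, that pages are evicted in plain LRU order, and that the workload sweeps its pages cyclically. The function names `lru_faults` and `pinned_split` are hypothetical and exist only for illustration.

```python
from collections import OrderedDict


def lru_faults(capacity_pages, working_set_pages, sweeps):
    """Count page faults for a cyclic sweep under LRU eviction (toy model)."""
    resident = OrderedDict()  # page id -> True, ordered from least to most recent
    faults = 0
    for _ in range(sweeps):
        for page in range(working_set_pages):
            if page in resident:
                resident.move_to_end(page)  # hit: refresh recency
            else:
                faults += 1  # miss: the page must be migrated in
                if len(resident) >= capacity_pages:
                    resident.popitem(last=False)  # evict least recently used
                resident[page] = True
    return faults


def pinned_split(capacity_pages, working_set_pages, sweeps):
    """Toy alternative: keep a fixed set resident, access the rest remotely.

    Returns (migrations, remote_accesses). Remote access stands in for a
    zero-copy-style mapping; choosing WHICH pages to keep is the hard part
    and is not modelled here.
    """
    migrations = min(capacity_pages, working_set_pages)
    remote_accesses = max(0, working_set_pages - capacity_pages) * sweeps
    return migrations, remote_accesses


if __name__ == "__main__":
    capacity = 100
    for ratio in (1.0, 1.3, 1.75, 3.0):
        working_set = int(capacity * ratio)
        print(ratio, lru_faults(capacity, working_set, sweeps=10),
              pinned_split(capacity, working_set, sweeps=10))
```

In this model, once the working set exceeds capacity, a cyclic sweep under LRU faults on every single access, because each page is evicted shortly before it is needed again. The fixed-split variant trades those repeated migrations for remote accesses, which hints at why placing some regions in zero-copy memory can help. The real cost balance depends on interconnect bandwidth and access patterns that this sketch ignores.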

Hyunkyun Shin

Yonsei University

Seongtae Bang

DGIST

Hyungwon Park