Pulse: Fine-Grained Hierarchical Hashing Index for Disaggregated Memory (HPCA 2026 - Main Conference)

Who

Guangyang Deng, Zixiang Yu, Zhirong Shen, Qiangsheng Su, Jiwu Shu

Track

HPCA 2026 Main Conference

Time Zone

The program is currently displayed in (GMT+11:00) Hobart.

When

Tue 3 Feb 2026 15:10 - 15:30 at Collaroy - Memory Systems for Scalable Computing Chair(s): Alexandros Daglis

Abstract

By decoupling compute and memory resources into independent pools that are provisioned and managed separately, disaggregated memory (DM) promises to break the scaling constraints of memory systems and improve resource utilization. However, it also brings a new challenge: designing a high-performance hashing index to manage a vast memory pool that has only weak computing power. In this paper, we reconsider this problem and find that existing hashing indexes for DM still suffer from two fundamental yet unresolved limitations: (i) amplified traffic under high insertion concurrency, and (ii) significantly high insertion latency, stemming mainly from directory synchronization and item relocation in the resizing process.

We resolve these limitations by designing Pulse, a fine-grained hierarchical hashing index for DM. Pulse comprises three design primitives. First, it proposes a multi-level index structure that breaks the conventional flat directory into multiple sub-directories organized hierarchically, achieving fine-grained directory synchronization. Second, it maintains a small portion of hashed keys in the memory pool to aid item relocation during resizing, thereby reducing resizing traffic and promoting system stability. Third, it exploits operation parallelism by tailoring the doorbell batching mechanism with selective signaling. Extensive experiments on a variety of benchmarks show that Pulse improves throughput by 3.46× and reduces tail latency by 76.3% compared to state-of-the-art hashing indexes.
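The idea of splitting a flat directory into independently resizable sub-directories can be illustrated with a small sketch. This is not the structure from the paper, whose layout the abstract does not specify. It is a minimal, single-machine model built on assumptions: the names `SubDirectory`, `HierarchicalDirectory`, `TOP_BITS`, and the per-sub-directory `version` counter are all hypothetical, and the hash function is an arbitrary choice. The sketch only shows why a resize confined to one sub-directory leaves every other sub-directory, and anything a client may have cached about it, untouched.

```python
import hashlib


def hash64(key: str) -> int:
    """64-bit hash of a key (blake2b is an arbitrary choice for this sketch)."""
    digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class SubDirectory:
    """Hypothetical sub-directory: 2**depth entries, each naming a bucket id."""

    def __init__(self, first_bucket: int):
        self.depth = 0
        self.entries = [first_bucket]
        self.version = 0  # assumed counter a client cache could compare against

    def double(self) -> int:
        """Double this sub-directory only; return how many entries were written."""
        # Entries are indexed by the low `depth` bits of the hash, so after
        # doubling, entries i and i + 2**old_depth refer to the same bucket.
        self.entries = self.entries + self.entries
        self.depth += 1
        self.version += 1
        return len(self.entries)


class HierarchicalDirectory:
    """Two-level directory: the top hash bits pick a sub-directory."""

    TOP_BITS = 4  # assumed fan-out of 16 sub-directories

    def __init__(self):
        self.subdirs = [SubDirectory(i) for i in range(1 << self.TOP_BITS)]

    def locate(self, key: str):
        """Return (sub-directory index, bucket id) for a key."""
        h = hash64(key)
        top = h >> (64 - self.TOP_BITS)
        sub = self.subdirs[top]
        low = h & ((1 << sub.depth) - 1)
        return top, sub.entries[low]


directory = HierarchicalDirectory()
keys = ["alpha", "beta", "gamma", "delta"]
before = {k: directory.locate(k) for k in keys}
versions_before = [s.version for s in directory.subdirs]

# Resize only the sub-directory that holds "alpha".
target, _ = directory.locate("alpha")
entries_written = directory.subdirs[target].double()

changed = [i for i, s in enumerate(directory.subdirs)
           if s.version != versions_before[i]]
print(entries_written, changed == [target])
```

In this model a resize rewrites only the entries of one sub-directory, whereas doubling a single flat directory would rewrite all of its entries. How Pulse actually organizes its levels and synchronizes them across clients is described in the paper, not here.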

Guangyang Deng

Xiamen University

Zixiang Yu

Xiamen University

Zhirong Shen

Xiamen University

Qiangsheng Su

Xiamen University

Jiwu Shu

Xiamen University

Session Program

Tue 3 Feb
Displayed time zone: Hobart

14:10 - 15:30	Memory Systems for Scalable Computing (Main Conference) at Collaroy. Chair(s): Alexandros Daglis (Georgia Tech)

14:10 20m Talk		BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism. Main Conference. Suhas Vittal (Georgia Tech), Moinuddin K. Qureshi (Georgia Tech)
14:30 20m Talk		RoMe: Row Granularity Access Memory System for Large Language Models. Main Conference. Hwayong Nam (Seoul National University), Seungmin Baek (Seoul National University), Jumin Kim (Seoul National University), Michael Jaemin Kim (Meta), Jung Ho Ahn (Seoul National University)
14:50 20m Talk		HDPAT: Hierarchical Distributed Page Address Translation for Wafer-Scale GPUs. Main Conference. Daoxuan Xu (William & Mary), Ying Li (William & Mary), Yuwei Sun (UIUC), Jie Ren (William & Mary), Yifan Sun (William & Mary)
15:10 20m Talk		Pulse: Fine-Grained Hierarchical Hashing Index for Disaggregated Memory. Main Conference. Guangyang Deng (Xiamen University), Zixiang Yu (Xiamen University), Zhirong Shen (Xiamen University), Qiangsheng Su (Xiamen University), Jiwu Shu (Xiamen University)