HPCA 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026
Mon 2 Feb 2026 10:30 - 10:50 at Collaroy - Cache Coherence and Chiplet Interconnects Chair(s): Alberto Ros

Current multiprocessors that support the total store order (TSO) memory consistency model invariably use write-back (WB) cache-coherence protocols. When their hardware needs to issue write-through (WT) stores, as in uncached operations, they deliver dismal performance: writes to main memory have to be fully serialized, often forcing the program to observe the full latency of a round-trip to memory.
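
To make the cost of full serialization concrete, here is a toy latency model. It is only a sketch and does not reproduce the paper's mechanism or evaluation: the function names, the 300 ns round-trip, and the 5 ns issue interval are illustrative assumptions. It contrasts fully serialized WT stores with an idealized case in which stores overlap.

```python
def serialized_latency_ns(num_stores: int, round_trip_ns: float) -> float:
    """Fully serialized: each WT store waits for the previous round-trip to finish."""
    return num_stores * round_trip_ns


def overlapped_latency_ns(
    num_stores: int, round_trip_ns: float, issue_interval_ns: float
) -> float:
    """Idealized overlap: stores issue back to back, so only the last
    store exposes a full round-trip (hypothetical best case)."""
    if num_stores == 0:
        return 0.0
    return (num_stores - 1) * issue_interval_ns + round_trip_ns


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    n, rtt, gap = 8, 300.0, 5.0
    print(serialized_latency_ns(n, rtt))       # 8 * 300
    print(overlapped_latency_ns(n, rtt, gap))  # 7 * 5 + 300
```

Under these assumed numbers, the serialized case grows linearly with the round-trip latency per store, while the overlapped case pays it roughly once.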

To solve this problem, this paper presents a novel architecture that supports high-performance cache-coherent WT stores under TSO. The architecture, called PhasedStore, involves extending the store queue of the core and the directory. Individual WT stores execute in two phases, which allow them to fully overlap with other stores and still satisfy TSO.

PhasedStore is useful in environments that require a WT cache-coherence protocol. This can be the case in resilience-critical platforms where node failures should not cause the loss of shared program state, or platforms with CPUs and accelerators where programs follow a producer-consumer pattern. This paper evaluates PhasedStore in the first environment, namely a CXL-based distributed shared-memory platform where shared data in the program uses a WT protocol to enable recovery. Our evaluation shows that PhasedStore is very effective. Compared to using the conventional approach to implement WT under TSO, PhasedStore reduces the average execution time of a set of parallel applications by 1.88x.

Mon 2 Feb

Displayed time zone: Hobart

09:50 - 11:10
Cache Coherence and Chiplet Interconnects - Main Conference at Collaroy
Chair(s): Alberto Ros University of Murcia
09:50
20m
Talk
$C^3$: CXL Coherence Controllers for Heterogeneous Architectures
Main Conference
Anatole Lefort Technical University of Munich (TUM), David Schall Technical University of Munich, Nicolò Carpentieri Technical University of Munich, Julian Pritzi Technical University of Munich, Soham Chakraborty TU Delft, Nicolai Oswald NVIDIA, Pramod Bhatotia TU Munich
10:10
20m
Talk
Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation
Main Conference
Yanjing Wang National University of Defense Technology, Lizhou Wu National University of Defense Technology, Sunfeng Gao National University of Defense Technology, Yibo Tang National University of Defense Technology, Junhui Luo National University of Defense Technology, Zicong Wang National University of Defense Technology, Yang Ou National University of Defense Technology, Dezun Dong NUDT, Nong Xiao National University of Defense Technology & Sun Yat-sen University, Mingche Lai National University of Defense Technology
10:30
20m
Talk
PhasedStore: Supporting High-performance Write-through Cache-coherence Protocols under TSO
Main Conference
Burak Ocalan University of Illinois Urbana-Champaign, Chloe Alverti University of Illinois at Urbana-Champaign, Shashwat Jaiswal University of Illinois Urbana-Champaign, USA, Antonis Psistakis University of Illinois Urbana-Champaign, David Koufaty Unaffiliated, Suyash Mahar UC San Diego, Steven Swanson University of California San Diego, Josep Torrellas University of Illinois at Urbana-Champaign
10:50
20m
Talk
Deadlock-Free Bridge Module for Inter-Chiplet Communication in Open Chiplet Ecosystem
Main Conference
Zhiqiang Chen National University of Defense Technology, Wenwen Fu National University of Defense Technology, Yongwen Wang National University of Defense Technology, Hongwei Zhou National University of Defense Technology