HPCA 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026
Events (21 results)

Welcome Reception

Catering When: Sun 1 Feb 2026 18:00 - 20:00

All attendees registered for the main conference are invited to attend the welcome reception from 18:00 on Sunday evening, where there will be great food and drink and an opportunity to engage with the vibrant HPCA/CGO/PPoPP/CC …

Oracle Parfait – Scaling Vulnerability Detection from Enterprise Systems to Cloud-Scale Systems and Beyond

Plenary Keynotes When: Tue 3 Feb 2026 08:45 - 09:45 People: Cristina Cifuentes

… to a DevSecOps model where security gets integrated at all levels of the software process …

Compiler 2.0: Building the Next Generation Compilers with Machine Learning

Plenary Keynotes When: Mon 2 Feb 2026 08:45 - 09:45 People: Saman Amarasinghe

… , complex vector instructions, and specialized accelerators have all pushed more …

MoEntwine: Unleashing the Potential of Wafer-scale Chips for Large-scale Expert Parallel Inference

Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Xinru Tang, Jingxiang Hou, Dingcheng Jiang, Taiquan Wei, Jiaxin Liu, Jinyi Deng, Huizheng Wang, Qize Yang, Haoran Shang, Chao Li, Yang Hu, Shouyi Yin

… parallelism (EP) to alleviate memory bottleneck, which introduces all-to-all … GPU clusters, high-overhead cross-node communication makes all-to-all expensive … provide a unified high-performance network connecting all devices, presenting …

Enterprise Class On-Chip Accelerator Integration

Industry Track When: Tue 3 Feb 2026 17:15 - 17:35 People: Deanna Berger, Alper Buyuktosunoglu, Craig Walters, Robert Sonnelitter, Hailey Nicholson, Ashraf ElSharif, Yamil Rivera, Avery Francois, Cedric Lichtenau, Jason Kohl

… with a sustained processor utilization of over 90% under all workload conditions … with an integrated multi-tier unified cache hierarchy all within one chip. The processor chip design leverages a unique approach to ensure all elements work in unison …

I-POP: Ignite Positive Prefetchers

Main Conference When: Tue 3 Feb 2026 10:50 - 11:10 People: Yiquan Lin, Wenhai Lin, Yiquan Chen, Jiexiong Xu, Shishun Cai, Jiarong Ye, Zonghui Wang, Wenzhi Chen

… prefetchers for issuing requests, but they all face limitations. Specifically, existing … prefetcher’s PE, and the Control Engine, which dynamically manages all

FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing

Main Conference When: Mon 2 Feb 2026 16:30 - 16:50 People: Yuzhe Fu, Changchun Zhou, Hancheng Ye, Bowen Duan, Qiyu Huang, Chiyue Wei, Cong Guo, Hai "Helen" Li, Yiran Chen

… ) block-parallel point operations that decompose and parallelize all point …

Protean: A Programmable Spectre Defense

Main Conference When: Wed 4 Feb 2026 10:30 - 10:50 People: Nicholas Mosier, Hamed Nemati, John C. Mitchell, Caroline Trippel

… We present the Protean Spectre defense—the first to be altogether comprehensive, covering all side-channels and speculation; programmer-transparent, requiring no source modifications; and programmable, tailoring its hardware protections …

GustavSNN: Unleashing the Power of Gustavson's Algorithm on SNN Acceleration with Column-Parallel Tick-Batch Dataflow

Main Conference When: Tue 3 Feb 2026 15:10 - 15:30 People: Sangwoo Hwang, Donghun Lee, Jahyun Koo, Jaeha Kung

… this by employing tick-batch techniques, which process all timesteps within a layer before …

GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping

Main Conference When: Tue 3 Feb 2026 17:15 - 17:35 People: Julien Eudine, Chu Li, Zhuo Cheng, Renzo Andri, Onur Mutlu, Can Firtina, Mohammad Sadrosadati, Nika Mansouri Ghiasi, Konstantina Koliogeorgi, Anirban Nag, Arash Tavakkol, Haiyu Mao, Shai Bergman, Ji Zhang

… CPU-based and 1.41× compared to hardware-based read mappers, all while …

VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG

Main Conference When: Wed 4 Feb 2026 10:50 - 11:10 People: Junkyum Kim, Divya Mahajan

… consistently expands the SLO-compliant request rate range across all tested …

LEGO: Supporting LLM-enhanced Games with One Gaming GPU

Main Conference When: Wed 4 Feb 2026 12:30 - 12:50 People: Han Zhao, Weihao Cui, Zeshen Zhang, Wenhao Zhang, Jiangtong Li, Quan Chen, Youmin Chen, Pu Pang, Zijun Li, Zhenhua Han, Yuqing Yang, Minyi Guo

… . Evaluations on an Nvidia RTX 4090 show that LEGO meets latency targets in all

QuCo: Efficient and Flexible Hardware-Driven Automatic Configuration of Tile Transfers in GPUs

Main Conference When: Wed 4 Feb 2026 10:50 - 11:10 People: Nicolas Meseguer, daoxuan xu, Yifan Sun, Michael Pellauer, José L. Abellán, Manuel E. Acacio

… , and synchronization primitives, all of which are hardware-specific and workload …

Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing

Main Conference When: Mon 2 Feb 2026 16:10 - 16:30 People: Xiaotong Huang, He Zhu, Tianrui Ma, Yuxiang Xiong, Fangxin Liu, Zhezhi He, Yiming Gan, Zihan Liu, Jingwen Leng, Yu Feng, Minyi Guo

… and 241.1× energy savings over state-of-the-art accelerators, all

Cyclone: Designing Efficient and Highly Parallel QCCD Architectural Codesigns for Fault Tolerant Quantum Memory

Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Sahil Khan, Abhinav Anand, Kenneth R. Brown, Jonathan M. Baker

… Modular trapped-ion quantum computing hardware, known as Quantum Charge Coupled Devices (QCCDs), requires shuttling operations in order to maintain effective all-to-all connectivity. Each module or trap can perform only one operation …

BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism

Main Conference When: Tue 3 Feb 2026 14:10 - 14:30 People: Suhas Vittal, Moinuddin K. Qureshi

… policy that works well across all the workloads. We develop a hybrid policy (BARD …

Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation

Main Conference When: Mon 2 Feb 2026 10:10 - 10:30 People: Yanjing Wang, Lizhou Wu, Sunfeng Gao, Yibo Tang, Junhui Luo, Zicong Wang, Yang Ou, Dezun Dong, Nong Xiao, Mingche Lai

… of modeling all CXL sub-protocols and device types. CXLSim has been rigorously …

WATOS: Efficient LLM Training Strategies and Architecture Co-exploration for Wafer-scale Chip

Main Conference When: Tue 3 Feb 2026 09:50 - 10:10 People: Huizheng Wang, Zichuan Wang, Hongbin Wang, Jingxiang Hou, Taiquan Wei, Chao Li, Yang Hu, Shouyi Yin

… , existing approaches all fall short in addressing these challenges. To bridge …

SALT: Track-and-Mitigate Subarrays, Not Rows, for Blast-Radius-Free Rowhammer Defense

Main Conference When: Mon 2 Feb 2026 11:50 - 12:10 People: Moinuddin K. Qureshi

… to the subarray before all rows are guaranteed to be refreshed, thus providing Blast …

Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models

Main Conference When: Mon 2 Feb 2026 09:50 - 10:10 People: Chiyue Wei, Cong Guo, Junyao Zhang, Haoxuan Shan, Yifan Xu, Ziyue Zhang, Yudong Liu, Qinsi Wang, Changchun Zhou, Hai "Helen" Li, Yiran Chen

… -level redundancy removal via motion-aware matching. All concentration steps …

MIRZA: Efficiently Mitigating Rowhammer with Randomization and ALERT

Main Conference When: Mon 2 Feb 2026 11:30 - 11:50 People: Hritvik Taneja, Ali Hajiabadi, Michele Marazzi, Kaveh Razavi, Moinuddin K. Qureshi

… In-DRAM Rowhammer mitigation requires three resources: space (to track aggressor rows), time (to perform mitigation), and energy (to refresh victim rows). An ideal in-DRAM mitigation must minimize all three overheads. Recent …