HPCA 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026

This program is tentative and subject to change.

Tue 3 Feb 2026 15:50 - 16:10 at Cronulla - Domain Specific Accelerators

Modern processors are increasingly adopting tensor cores as key computational units. Compared to existing designs for dense and structured sparsity, recent dual-side sparse tensor cores support general sparsity. However, existing methods still face limitations in genericity (incomplete sparse kernel support prevents broad applicability) and performance (outer-product/row-row schemes yield unsatisfactory hardware utilization, data reuse, and energy efficiency).

In this paper, we propose Uni-STC, a unified sparse tensor core that delivers high-performance dataflows for four key sparse kernels: sparse matrix-vector multiplication (SpMV), sparse matrix-sparse vector multiplication (SpMSpV), sparse matrix-multiple vector multiplication (SpMM), and sparse general matrix-matrix multiplication (SpGEMM). To efficiently support these diverse sparse workloads, we introduce BBC, a unified sparse format co-designed with Uni-STC’s dataflow. We then design Uni-STC’s architecture supporting (1) fine-grained task partitioning to improve resource utilization, (2) parallel sparse-tile processing for enhanced data reuse, and (3) a dynamic network to reduce intermediate data movement and energy consumption. Evaluated across 2893 SuiteSparse and 302 DLMC matrices, Uni-STC demonstrates significant improvements over state-of-the-art sparse tensor cores in both performance and energy efficiency.
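For readers unfamiliar with the four kernels named in the abstract, the following is a minimal reference sketch of what each one computes. It is not Uni-STC's dataflow and it does not use the BBC format: the dictionary-of-rows layout, the function names, and the `n_rows` parameter are all assumptions made for illustration only.

```python
from collections import defaultdict

# Assumed illustrative layout (NOT the BBC format from the abstract):
#   sparse matrix -> {row: {col: value}}
#   sparse vector -> {index: value}
#   dense vector  -> list, dense matrix -> list of row lists


def spmv(A, x, n_rows):
    """SpMV: sparse matrix times dense vector, returns a dense list."""
    y = [0.0] * n_rows
    for i, row in A.items():
        y[i] = sum(v * x[j] for j, v in row.items())
    return y


def spmspv(A, x):
    """SpMSpV: sparse matrix times sparse vector, returns a sparse vector."""
    y = {}
    for i, row in A.items():
        # Only columns present in both the row and x contribute.
        acc = sum(v * x[j] for j, v in row.items() if j in x)
        if acc != 0:
            y[i] = acc
    return y


def spmm(A, B, n_rows):
    """SpMM: sparse matrix times dense matrix (several dense vectors at once)."""
    n_cols = len(B[0])
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i, row in A.items():
        for k, a in row.items():
            for j in range(n_cols):
                C[i][j] += a * B[k][j]
    return C


def spgemm(A, B):
    """SpGEMM: sparse times sparse, computed row by row, returns a sparse matrix."""
    C = {}
    for i, row in A.items():
        acc = defaultdict(float)
        for k, a in row.items():
            # Scale row k of B by A[i][k] and accumulate into row i of C.
            for j, b in B.get(k, {}).items():
                acc[j] += a * b
        if acc:
            C[i] = dict(acc)
    return C


# A small 2x3 example matrix: [[2, 0, 1], [0, 3, 0]]
A = {0: {0: 2.0, 2: 1.0}, 1: {1: 3.0}}
B_dense = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
B_sparse = {0: {0: 1.0}, 1: {1: 1.0}, 2: {0: 2.0, 1: 2.0}}

y_dense = spmv(A, [1.0, 2.0, 3.0], n_rows=2)
y_sparse = spmspv(A, {2: 4.0})
C_dense = spmm(A, B_dense, n_rows=2)
C_sparse = spgemm(A, B_sparse)
```

The four functions differ only in which operands and results are sparse, which is one way to see why a single unified format and datapath might plausibly serve all of them. The real design decisions (tiling, task partitioning, the on-chip network) are described in the paper, not here.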

Tue 3 Feb

Displayed time zone: Hobart

15:50 - 17:10
Domain Specific Accelerators (Main Conference) at Cronulla
15:50
20m
Talk
Uni-STC: Unified Sparse Tensor Core
Main Conference
Haocheng Lian China University of Petroleum-Beijing, Qiyue Zhang China University of Petroleum-Beijing, Xinran Zhao China University of Petroleum-Beijing, Meichen Dong China University of Petroleum-Beijing, Yijie Nie China University of Petroleum-Beijing, Zhengyi Zhao China University of Petroleum-Beijing, Junzhong Shen National University of Defense Technology, Wei Guo National University of Defense Technology, Chun Huang National University of Defense Technology, Bingcai Sui National University of Defense Technology, Weifeng Liu China University of Petroleum-Beijing
16:10
20m
Talk
AUM: Unleashing the Efficiency Potential of Shared Processors with Accelerator Units for LLM Serving
Main Conference
Xinkai Wang Shanghai Jiao Tong University, Chao Li Shanghai Jiao Tong University, Yiming Zhuansun Shanghai Jiao Tong University, Jinyang Guo Shanghai Jiao Tong University, Xiaofeng Hou Shanghai Jiao Tong University, Jing Wang Shanghai Jiao Tong University, Luping Wang Alibaba Group, Weigao Chen Alibaba Group, Cheng Huang Alibaba Group, Guodong Yang Alibaba Group, Liping Zhang Alibaba Group, Minyi Guo Shanghai Jiao Tong University
16:30
20m
Talk
DRACO: A Hardware-Efficient Robot Rigid Body Dynamics Accelerator with Precision-Aware Quantization Framework
Main Conference
Xingyu Liu The Hong Kong University of Science and Technology, Jiawei Liang The Hong Kong University of Science and Technology, Yipu Zhang The Hong Kong University of Science and Technology, Linfeng Du The Hong Kong University of Science and Technology, Chaofang Ma The Hong Kong University of Science and Technology, Hui Yu Hong Kong University of Science and Technology, Xu Jiang University of Electronic Science and Technology of China, Wei Zhang The Hong Kong University of Science and Technology
16:50
20m
Talk
REASON: Accelerating Probabilistic Logical Reasoning for Neuro-Symbolic Cognitive Intelligence
Main Conference
Zishen Wan Georgia Institute of Technology, Che-Kai Liu Georgia Institute of Technology, Jiayi Qian Georgia Institute of Technology, Hanchen Yang Georgia Institute of Technology, Arijit Raychowdhury Georgia Institute of Technology, Tushar Krishna Georgia Institute of Technology