HPCA 2026
Sat 31 January - Wed 4 February 2026 Sydney, Australia
co-located with HPCA/CGO/PPoPP/CC 2026
Tue 3 Feb 2026 10:10 - 10:30 at Coogee - Wafer-Scale Systems for Large Models Chair(s): Hyesoon Kim

The deployment of large language models (LLMs) imposes significant demands on computing, memory, and communication resources. Wafer-scale chips, leveraging advanced packaging technologies, enable high-density integration of computing and memory resources and offer high die-to-die (D2D) communication bandwidth, making them a promising architectural solution to these demands. However, their unprecedented chip area introduces significant design complexity. Wafer-scale chips feature a multi-level architecture spanning the wafer, die, and core levels, with critical parameter choices and trade-offs at each level. This also poses major challenges for LLM service scheduling: fully leveraging the advantages of wafer-scale technology while mitigating its limitations is essential to turning massive hardware resources into actual performance. Unfortunately, methods to address these challenges remain scarce.

To bridge this gap, we propose FACE, a co-exploration framework for multi-level architecture and scheduling. We first define a highly configurable and general hardware template to systematically explore the optimal architecture and micro-architecture parameters. Leveraging the fine-grained control and high interconnect bandwidth of wafer-scale chips, FACE implements an LLM scheduling strategy that achieves fully overlapped prefill-decode execution and efficient KV cache management, minimizing prefill-decode interference while maximizing hardware resource utilization. Our evaluation demonstrates that FACE achieves an average overall performance improvement of 3.68× across various LLM models and datasets compared to state-of-the-art (SOTA) LLM serving systems on wafer-scale chips. Moreover, FACE provides valuable insights into wafer-scale multi-level architecture design and LLM workload execution. The FACE framework will be open-sourced.
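The scheduling idea the abstract describes, prefill and decode running concurrently on disjoint resource groups so a long prefill never stalls in-flight decodes, can be sketched in a toy form. This is an illustrative simplification, not FACE's actual scheduler; the names (`Request`, `schedule_step`, `prefill_chunk`) and the chunked-prefill policy are assumptions for the sake of the example.

```python
# Toy illustration of prefill-decode (PD) overlapped scheduling with simple
# KV cache accounting. Each tick, a "prefill group" advances one request by a
# fixed chunk while a "decode group" emits one token per in-flight request.
from collections import deque
from dataclasses import dataclass

@dataclass
class Request:
    rid: int
    prompt_tokens: int   # tokens left to prefill
    output_tokens: int   # tokens left to decode
    kv_blocks: int = 0   # KV cache blocks allocated so far

def schedule_step(prefill_q, decode_q, prefill_chunk=8):
    """One scheduler tick: issue a prefill chunk and a decode batch in parallel."""
    issued = {"prefill": [], "decode": []}
    # Prefill group: advance the head request by one chunk.
    if prefill_q:
        req = prefill_q[0]
        chunk = min(prefill_chunk, req.prompt_tokens)
        req.prompt_tokens -= chunk
        req.kv_blocks += chunk
        issued["prefill"].append((req.rid, chunk))
        if req.prompt_tokens == 0:        # prompt done -> enters decode
            decode_q.append(prefill_q.popleft())
    # Decode group: every in-flight request emits one token.
    finished = []
    for req in decode_q:
        req.output_tokens -= 1
        req.kv_blocks += 1
        issued["decode"].append(req.rid)
        if req.output_tokens == 0:
            finished.append(req)
    for req in finished:
        decode_q.remove(req)
        req.kv_blocks = 0                 # free KV cache on completion
    return issued

prefill_q = deque([Request(0, prompt_tokens=16, output_tokens=4),
                   Request(1, prompt_tokens=8, output_tokens=2)])
decode_q = deque()
trace = [schedule_step(prefill_q, decode_q) for _ in range(6)]
```

Because the two groups are disjoint, the prefill of request 1 at tick 3 runs in the same tick as request 0's decode steps, rather than preempting them, which is the interference-avoidance property the abstract refers to.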

Tue 3 Feb

Displayed time zone: Hobart

09:50 - 11:10
Wafer-Scale Systems for Large Models (Main Conference) at Coogee
Chair(s): Hyesoon Kim (Georgia Institute of Technology)
09:50
20m
Talk
WATOS: Efficient LLM Training Strategies and Architecture Co-exploration for Wafer-scale Chip
Main Conference
Huizheng Wang (Tsinghua University), Zichuan Wang (Tsinghua University), Hongbin Wang (Tsinghua University), Jingxiang Hou (Tsinghua University), Taiquan Wei (Tsinghua University), Chao Li (Shanghai Jiao Tong University), Yang Hu (Tsinghua University), Shouyi Yin (Tsinghua University)
10:10
20m
Talk
FACE: Fully PD Overlapped Scheduling and Multi-Level Architecture Co-Exploration on Wafer
Main Conference
Zheng Xu (Tsinghua University), Dehao Kong (Tsinghua University), Jiaxin Liu (Tsinghua University), Dingcheng Jiang (Tsinghua University), Xu Dai (Shanghai Artificial Intelligence Laboratory), Jinyi Deng (Tsinghua University), Yang Hu (Tsinghua University), Shouyi Yin (Tsinghua University)
10:30
20m
Talk
TEMP: A Memory Efficient Physical-aware Tensor Partition-Mapping Framework on Wafer-scale Chips
Main Conference
Huizheng Wang (Tsinghua University), Taiquan Wei (Tsinghua University), Zichuan Wang (Tsinghua University), Dingcheng Jiang (Tsinghua University), Qize Yang (Tsinghua University), Jiaxin Liu (Tsinghua University), Jingxiang Hou (Tsinghua University), Chao Li (Shanghai Jiao Tong University), Jinyi Deng (Tsinghua University), Yang Hu (Tsinghua University), Shouyi Yin (Tsinghua University)
10:50
20m
Talk
MoEntwine: Unleashing the Potential of Wafer-scale Chips for Large-scale Expert Parallel Inference
Main Conference
Xinru Tang (Tsinghua University), Jingxiang Hou (Tsinghua University), Dingcheng Jiang (Tsinghua University), Taiquan Wei (Tsinghua University), Jiaxin Liu (Tsinghua University), Jinyi Deng (Tsinghua University), Huizheng Wang (Tsinghua University), Qize Yang (Tsinghua University), Haoran Shang (Tsinghua University), Chao Li (Shanghai Jiao Tong University), Yang Hu (Tsinghua University), Shouyi Yin (Tsinghua University)