SnakeMan: Applying Relation-centric Notation to Model and Optimize Data Swizzle in the Cache of Modern NPU
This program is tentative and subject to change.
Swizzle is a data access pattern optimization technique that reorganizes the execution order of computational tasks to improve cache locality in modern NPUs. Existing analysis and optimization techniques lack support for swizzle-aware modeling on NPUs and fail to capture cache behavior across diverse swizzle configurations. To this end, we propose SnakeMan, a framework for modeling and optimizing swizzle. We introduce a relation-centric notation to characterize different cache access patterns, which allows a wider swizzle space to be explored. We then propose a hybrid performance model based on set theory for cache analysis. The model uses an analytical approach to quantify cache misses under unsaturated cache conditions (non-saturated misses), and a simulation method combined with an early-exit mechanism to model cache behavior under saturated cache conditions (saturated misses). Experimental evaluations show that SnakeMan achieves an average modeling accuracy of 93.5% in transfer latency compared to real-world hardware. Evaluation on a variety of DNNs shows that SnakeMan outperforms existing tensor program optimizers by up to 1.5× on A100 GPUs. We also demonstrate NPU cache size optimization based on SnakeMan.
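To make the idea of swizzle concrete, the following is a minimal sketch, not SnakeMan's notation or performance model. It assumes a toy tiled matrix multiplication in which output tile (i, j) reads one block of A (row band i) and one block of B (column band j), and a fully associative LRU cache whose capacity is counted in blocks. The function names `row_major_order`, `grouped_order`, and `count_misses` are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


def row_major_order(rows, cols):
    """Baseline task order: visit output tiles row by row."""
    return [(i, j) for i in range(rows) for j in range(cols)]


def grouped_order(rows, cols, group):
    """One possible swizzle: split rows into bands of `group` rows and
    walk each band column by column, so a small set of A blocks is reused."""
    order = []
    for start in range(0, rows, group):
        band = range(start, min(start + group, rows))
        for j in range(cols):
            for i in band:
                order.append((i, j))
    return order


def count_misses(order, capacity):
    """Count misses of a toy LRU cache holding `capacity` blocks.
    Assumption: tile (i, j) touches block ("A", i) and then block ("B", j)."""
    cache = OrderedDict()
    misses = 0
    for i, j in order:
        for block in (("A", i), ("B", j)):
            if block in cache:
                cache.move_to_end(block)  # refresh recency on a hit
            else:
                misses += 1
                cache[block] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used
    return misses


if __name__ == "__main__":
    baseline = row_major_order(8, 8)
    swizzled = grouped_order(8, 8, group=4)
    print("row-major misses:", count_misses(baseline, capacity=6))
    print("grouped misses:  ", count_misses(swizzled, capacity=6))
```

Both orders execute the same set of tiles; only the sequence differs. In this toy configuration the grouped order keeps its working set (four A blocks plus one B block) within the cache, while the row-major order cycles through more B blocks than the cache can hold and incurs more misses.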
Tue 3 Feb (displayed time zone: Hobart)
11:30 - 12:50
11:30 (20m, Talk) | Athena: Synergizing Data Prefetching and Off-Chip Prediction via Online Reinforcement Learning | Main Conference | Zhenrong Lang (ETH Zürich), Rahul Bera (ETH Zurich), Caroline Hengartner (ETH Zürich), Konstantinos Kanellopoulos (ETH Zurich), Rakesh Kumar (NTNU), Mohammad Sadrosadati (ETH Zürich), Onur Mutlu (ETH Zurich)
11:50 (20m, Talk) | Streamlined On-Chip Temporal Prefetching | Main Conference
12:10 (20m, Talk) | Intermittence-Aware Cache Compression | Main Conference | Gan Fang (Purdue University), Jianping Zeng (Arizona State University), Yuchen Zhou (Purdue University), Changhee Jung (Purdue University, USA)
12:30 (20m, Talk) | SnakeMan: Applying Relation-centric Notation to Model and Optimize Data Swizzle in the Cache of Modern NPU | Main Conference | Hanyu Zhang (Zhejiang University), Fangxu Guo (Zhejiang University), Liqiang Lu (Zhejiang University), Long Wang (Huawei Technologies), Yunfei Du (Huawei Technologies), Zhe Wang (Huawei Technologies), Jinghan Zhang (Huawei Technologies), Jie Zhang (Peking University), Chenli Xue (Zhejiang University), Chengpeng Wu (Zhejiang University), Ziyi Zhang (Zhejiang University), Eric Liang (Peking University), Size Zheng (ByteDance), Jianwei Yin (Zhejiang University)