The 32nd IEEE International Symposium on High-Performance Computer Architecture (HPCA) will be held in Sydney, Australia in 2026. HPCA is a high-impact premier venue for presenting research results on a wide range of Computer Architecture topics. Authors who have a question regarding topic fit are encouraged to contact the program chairs (Aamer Jaleel and Mattan Erez). Some topics of interest are listed below:
- Processor, memory, and storage systems architecture and microarchitecture
- Interconnection networks and network interface architecture
- Domain-specific architectures and accelerators
- FPGA, CGRA, and reconfigurable systems
- Near-/in-memory computing
- Cloud, datacenter, cluster/distributed computer systems
- Compilers/OS/runtimes as they relate to Computer Architecture
- IoT, mobile, edge, and embedded architectures
- Effects of circuits or technology on architecture (3D/chiplets/interposer/wafer-scale)
- Architecture modeling and simulation methodologies
- Architectures using quantum, superconducting, and emerging technologies
- Reliability/fault tolerance as they relate to Computer Architecture
- Security and privacy as they relate to Computer Architecture
- Evaluation and measurement of real computing systems
- Verification, testing, and correctness as they relate to Computer Architecture
- Power and energy as they relate to Computer Architecture
- Sustainable computing as it relates to Computer Architecture
HPCA 2026 features a separate Industry Track with a separate call for papers. The goal of the HPCA Industry Track is to publish papers that are written by industry authors and whose content relates to industrial products/processes.
Important Dates
| Milestone | Deadline |
|---|---|
| Abstract Submission | July 25, 2025 23:59 UTC / 19:59 EDT |
| Paper Submission | August 1, 2025 23:59 UTC / 19:59 EDT |
| Revision/Rebuttal Period | October 7 – 20, 2025 |
| Notification | November 7, 2025 |
| Final Papers Due | December 4, 2025 |
All paper submissions to HPCA must adhere to several important requirements. Please review them, as detailed below, before submitting.
Final Submission Mindset
Submitting papers for review that are not yet complete and polished abuses the review process and is disrespectful to the Program Committee and the Computer Architecture community. All authors must affirm that their submission is, to the best of their abilities, complete, polished, and ready for comprehensive review.
Furthermore, if all reviewers of a submission find that the submission is lacking in this regard, detailed reviewer feedback will be withheld from the authors. In other words, there is a cost to submitting unfinished work – it may “poison” future reviews without the authors gaining any feedback.
Reviewer Continuity Initiative
HPCA 2026 is considering an opt-in initiative to maintain reviewer continuity for submitted papers that were rejected from MICRO 2025. The goal of this initiative is to improve the fairness of the overall review process in Computer Architecture. It is in the best interest of authors and the community to have papers undergo a consistent revision process. However, please do not feel pressured to participate in this initiative; reviewers will not be made aware of your choice implicitly or explicitly.
If all authors of a submission opt in to the “reviewer continuity initiative”, reviewers who had previously reviewed a version of the submission for MICRO 2025 may be explicitly assigned to review the submission (PC members also must opt in to this initiative). If the authors do not opt in for “reviewer continuity”, reviewers of previous submissions may still be assigned to review the submitted version of the paper based on reviewer expertise, reviewer load balance, and other similar agnostic criteria.
Revision Letters
If a submission has been previously reviewed and rejected from another venue or conference and is now being submitted to HPCA 2026, the authors must provide a letter explaining how the paper has been revised for this current submission, regardless of whether the authors opt to participate in “reviewer continuity” or not.
Authors may choose to permit only the PC Chairs to have access to and knowledge of this letter. Authors who opt in for reviewer continuity implicitly make their revision letter available to reviewers who reviewed a prior version for MICRO 2025. Note that it is not a requirement that papers be revised before being submitted to HPCA, but we expect a revision letter nonetheless.
Abstract Registration and Final Submission
Authors must register an abstract one week before the paper submission deadline. The purpose is to ensure that authors are fully aware of all submission requirements, topic selection, conflict of interest registration, etc. All information regarding the submission may be edited up to final submission time but may not change once the review process has started. In particular, authorship at submission time must match that of the final publication, if accepted.
Formatting and Submission Instructions
Detailed formatting and submission instructions will be made available on the submission site (hpca2026.hotcrp.com) at a later date.
This program is tentative and subject to change.
Mon 2 Feb (displayed time zone: Hobart)
09:50 - 11:10 | |||
09:50 20mTalk | $C^3$: CXL Coherence Controllers for Heterogeneous Architectures Main Conference David Schall Technical University of Munich, Anatole Lefort Technical University of Munich (TUM), Nicolò Carpentieri Technical University of Munich, Julian Pritzi Technical University of Munich, Soham Chakraborty TU Delft, Nicolai Oswald NVIDIA, Pramod Bhatotia TU Munich | ||
10:10 20mTalk | Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation Main Conference Yanjing Wang National University of Defense Technology, Lizhou Wu National University of Defense Technology, Sunfeng Gao National University of Defense Technology, Yibo Tang National University of Defense Technology, Junhui Luo National University of Defense Technology, Zicong Wang National University of Defense Technology, Yang Ou National University of Defense Technology, Dezun Dong NUDT, Nong Xiao National University of Defense Technology & Sun Yat-sen University, Mingche Lai National University of Defense Technology | ||
10:30 20mTalk | Supporting High-performance Write-through Cache-Coherence Protocols under TSO Main Conference Burak Ocalan University of Illinois Urbana-Champaign, Chloe Alverti University of Illinois at Urbana-Champaign, Shashwat Jaiswal University of Illinois Urbana-Champaign, USA, Antonis Psistakis University of Illinois Urbana-Champaign, David Koufaty Unaffiliated, Suyash Mahar UC San Diego, Steven Swanson University of California San Diego, Josep Torrellas University of Illinois at Urbana-Champaign | ||
10:50 20mTalk | Deadlock-Free Bridge Module for Inter-Chiplet Communication in Open Chiplet Ecosystem Main Conference Zhiqiang Chen National University of Defense Technology, Wenwen Fu National University of Defense Technology, Yongwen Wang National University of Defense Technology, Hongwei Zhou National University of Defense Technology | ||
09:50 - 11:10 | |||
09:50 20mTalk | Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models Main Conference Chiyue Wei Duke University, Cong Guo Duke University, Junyao Zhang Duke University, Haoxuan Shan Duke University, Yifan Xu Duke University, Ziyue Zhang Duke University, Yudong Liu Duke University, Qinsi Wang Duke University, Changchun Zhou Duke University, Hai "Helen" Li Duke University, Yiran Chen Duke University | ||
10:10 20mTalk | LoCaLUT: Harnessing Capacity–Computation Tradeoffs for LUT-Based Inference in DRAM-PIM Main Conference Junguk Hong Seoul National University, Changmin Shin Seoul National University, Sukjin Kim Seoul National University, Si Ung Noh Seoul National University, Taehee Kwon Seoul National University, Seongyeon Park Seoul National University, Hanjun Kim Yonsei University, Youngsok Kim Yonsei University, Jinho Lee Seoul National University | ||
10:30 20mTalk | RPU - A Reasoning Processing Unit Main Conference Matthew Adiletta Harvard University, David Brooks Harvard University, Gu-Yeon Wei Harvard University | ||
10:50 20mTalk | PinDrop: Breaking the Silence on SDCs in a Large-Scale Fleet Main Conference Peter W. Deutsch Massachusetts Institute of Technology/Meta, Harish D. Dixit Meta, Gautham Vunnam Meta, Carl Moran Meta, Eleanor Ozer Meta, Sriram Sankar Meta | ||
09:50 - 11:10 | |||
09:50 20mTalk | UniFHE: Faster Accelerator for FHE with Diverse Algebraic Structure and Balanced Memory System Main Conference Qingyun Niu Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS and School of Cyber Security, University of Chinese Academy of Sciences, Lutan Zhao State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Ming Cai Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS and School of Cyber Security, University of Chinese Academy of Sciences, kai li Institute of Information Engineering, CAS, Dan Meng Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Rui Hou Institute of Information Engineering, CAS | ||
10:10 20mTalk | Leveraging ASIC AI Chips for Homomorphic Encryption Main Conference Jianming Tong Georgia Institute of Technology, Tianhao Huang MIT, Leo de Castro MIT, Anirudh Itagi Georgia Institute of Technology, Jingtian Dang Georgia Tech, Anupam Golder Georgia Institute of Technology, Asra Ali Google, Jevin Jiang Google, Jeremy Kun Google, Arvind Massachusetts Institute of Technology, G. Edward Suh Cornell University, USA, Tushar Krishna Georgia Institute of Technology Pre-print | ||
10:30 20mTalk | CROPHE: Cross-Operator Dataflow Optimization for Fully Homomorphic Encryption Accelerators Main Conference Xinhua Chen Fudan University, Jiangbin Dong Xi'an Jiaotong University, Hongren Zheng Tsinghua University, Tian Tang Tsinghua University, Mingyu Gao Tsinghua University | ||
10:50 20mTalk | Peregrine: Accelerating TFHE Bootstrapping on GPUs via Multi-Level External Product Co-Design Main Conference Haoqi He State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, Chinese Academy of Sciences and School of Cyber Security, University of Chinese Academy of Sciences, Zhiwei Wang State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Lutan Zhao State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Dian Jiao State Key Laboratory of Cyberspace Security Defense, Institute of Information Engineering, CAS, Dan Meng Institute of Information Engineering at Chinese Academy of Sciences; University of Chinese Academy of Sciences, Rui Hou Institute of Information Engineering, CAS | ||
11:10 - 11:30 | |||
11:10 20mCoffee break | Break HPCA/CGO/PPoPP/CC Catering | ||
11:30 - 12:50 | |||
11:30 20mTalk | MIRZA: Efficiently Mitigating Rowhammer with Randomization and ALERT Main Conference Hritvik Taneja Georgia Tech, Ali Hajiabadi ETH Zurich, Michele Marazzi ABB Research, Kaveh Razavi ETH Zürich, Moinuddin K. Qureshi Georgia Tech | ||
11:50 20mTalk | SALT: Track-and-Mitigate Subarrays, Not Rows, for Blast-Radius-Free Rowhammer Defense Main Conference Moinuddin K. Qureshi Georgia Tech | ||
12:10 20mTalk | ReScue: Reliable and Secure CXL Memory Main Conference Chihun Song UIUC, Austin Antony Cruz UIUC, Michael Jaemin Kim Meta, Minbok Wi Seoul National University, Gaohan Ye UIUC, Kyungsan Kim Samsung Electronics, Sangyeol Lee Samsung Electronics, Jung Ho Ahn Seoul National University, Nam Sung Kim UIUC | ||
12:30 20mTalk | Secret Caching Sauce for High-Performance Secure Memory Main Conference Xu Jiang Huazhong University of Science and Technology, Xueliang Wei Huazhong University of Science and Technology, YiFei Qu Huazhong University of Science and Technology, Dan Feng Huazhong University of Science and Technology, China, Yulai Xie Huazhong University of Science and Technology, Wei Tong Huazhong University of Science and Technology, China | ||
11:30 - 12:50 | |||
11:30 20mTalk | PIMphony: Overcoming Bandwidth and Capacity Inefficiency in PIM-based Long-Context LLM Inference System Main Conference hyucksung kwon Hanyang University, Kyungmo Koo Hanyang University, Janghyeon Kim Hanyang University, Woongkyu Lee Hanyang University, Minjae Lee Hanyang University, Gyeonggeun Jung KAIST, Hyungdeok Lee Solution Advanced Technology, SK hynix, Yousub Jung Solution Advanced Technology, SK hynix, Jaehan Park Solution Advanced Technology, SK hynix, Yosub Song Solution Advanced Technology, SK hynix, Byeongsu Yang Solution Advanced Technology, SK hynix, Haerang Choi Solution Advanced Technology, SK hynix, Guhyun Kim Solution Advanced Technology, SK hynix, Jongsoon Won Solution Advanced Technology, SK hynix, Woojae Shin Solution Advanced Technology, SK hynix, Changhyun Kim Solution Advanced Technology, SK hynix, Shin Gyeongcheol Solution Advanced Technology, SK hynix, Yongkee Kwon Tenstorrent, Ilkon Kim Solution Advanced Technology, SK hynix, Euicheol Lim SK hynix, John Kim KAIST, Jungwook Choi Hanyang University | ||
11:50 20mTalk | Adaptive Draft Sequence Length: Enhancing Speculative Decoding Throughput on PIM-Enabled Systems Main Conference Runze Wang Huazhong University of Science and Technology, Qinggang Wang Huazhong University of Science and Technology, Haifeng Liu Huazhong University of Science and Technology, Long Zheng Huazhong University of Science and Technology, XIAOFEI LIAO Huazhong University of Science and Technology, Hai Jin Huazhong University of Science and Technology, Jingling Xue University of New South Wales | ||
12:10 20mTalk | Conduit: Programmer-Transparent Near-Data Processing Using Multiple Compute-Capable Resources in SSDs Main Conference Rakesh Nadig ETH Zurich, Vamanan Arulchelvan ETH Zurich, Mayank Kabra ETH Zurich, Harshita Gupta ETH Zurich, Rahul Bera ETH Zurich, Nika Mansouri Ghiasi ETH Zurich, Nanditha Rao ETH Zurich, Qingcai Jiang ETH Zurich, Andreas Kosmas Kakolyris ETH Zurich, Yu Liang ETH Zurich, Mohammad Sadrosadati ETH Zürich, Onur Mutlu ETH Zurich | ||
12:30 20mTalk | Inter-Die Interconnection Networks for Reducing Peak Current Overlaps in Next-Generation NAND Systems Main Conference | ||
12:50 - 14:10 | |||
12:50 80mLunch | Lunch HPCA/CGO/PPoPP/CC Catering | ||
14:10 - 15:30 | |||
14:10 20mTalk | Predicting DRAM Failures at Scale: A Two-Stage Approach for Heterogeneous Systems Main Conference Chenglin Wang Xiamen University, Shouxin Wang Xiamen University, Shuyue Zhou Xiamen University, Ronglong Wu Xiamen University, Zhirong Shen Xiamen University, Lu Tang Xiamen University, Yiming Zhang Xiamen University, Jialiang Yu Huawei, Min Zhou Huawei | ||
14:30 20mTalk | MemSOS: OS-Guided Selective Memory Mirroring Main Conference Junghoon Kim Seoul National University & Samsung Electronics, Jongheon Jeong Seoul National University, Seokwon Moon Seoul National University, Seong Hoon Seo Seoul National University, Yeonhong Park Seoul National University, Jinkyu Jeong Yonsei University, Nam Sung Kim UIUC, Jae W. Lee Seoul National University | ||
14:50 20mTalk | ASPA: Reassigning DDR5 Parity Bandwidth Main Conference Fan Li University of Central Florida, Qiufeng Li George Washington University, Yanan Guo University of Rochester, Weidong Cao George Washington University, Xin Xin University of Central Florida | ||
15:10 20mTalk | HR-DCIM: High-Reliability Floating-Point Digital CIM Architecture with Unified Low-Cost Iterative Error Correction Main Conference Zhen He Tsinghua University, Yiqi Wang Tsinghua University, Zhiheng Yue Tsinghua University, Zihan Wu Tsinghua University, Huiming Han Tsinghua University, Shaojun Wei Tsinghua University, Yang Hu Tsinghua University, Fengbin Tu The Hong Kong University of Science and Technology, Shouyi Yin Tsinghua University | ||
14:10 - 15:30 | |||
14:10 20mTalk | Towards Resource-Efficient Serverless LLM Inference with SLINFER Main Conference | ||
14:30 20mTalk | ELORA: Efficient LoRA and KV Cache Management for Multi-LoRA LLM Serving Main Conference Jiuchen Shi Shanghai Jiao Tong University & The Hong Kong Polytechnic University, Hang Zhang Shanghai Jiao Tong University, Yixiao Wang Shanghai Jiao Tong University, Quan Chen Shanghai Jiao Tong University, China, Yizhou Shan Huawei Cloud, Kaihua Fu Hong Kong University of Science and Technology, Wei Wang Hong Kong University of Science and Technology, Minyi Guo Shanghai Jiao Tong University | ||
14:50 20mTalk | PASCAL: A Phase-Aware Scheduling Algorithm for Serving Reasoning-based Large Language Models Main Conference | ||
15:10 20mTalk | The Cost of Dynamic Reasoning: Demystifying AI Agents and Test-Time Scaling from an AI Infrastructure Perspective Main Conference | ||
14:10 - 15:30 | |||
14:10 20mTalk | CLINE: Improving Control Flow Compilation of Quantum Programs with Control Line Encoding Main Conference Anbang Wu Shanghai Jiao Tong University, Liqiang Lu Zhejiang University, Jianwei Yin Zhejiang University, Jingwen Leng Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University | ||
14:30 20mTalk | Fully Parallelized BP Decoding for Quantum LDPC Codes Can Outperform BP-OSD Main Conference Ming Wang North Carolina State University, Ang Li Pacific Northwest National Laboratory, Frank Mueller North Carolina State University, USA | ||
14:50 20mTalk | DC-MBQC: A Distributed Quantum Compilation Framework for Measurement-Based Quantum Computing Main Conference Yecheng Xue Peking University, Rui Yang Peking University, Zhiding Liang The Chinese University of Hong Kong, Tongyang Li Peking University | ||
15:10 20mTalk | TraceQ: Trace-Based Reconstruction of Quantum Circuit Dataflow in Surface-Code Fault-Tolerant Quantum Computing Main Conference Theodoros Trochatos Yale University, Christopher Kang University of Chicago, Andrew Wang Cornell University, Frederic T. Chong University of Chicago, Jakub Szefer Northwestern University | ||
15:30 - 15:50 | |||
15:30 20mCoffee break | Break HPCA/CGO/PPoPP/CC Catering | ||
15:50 - 17:10 | |||
15:50 20mTalk | The Memory Processing Unit: A Generalized Interface for End-to-End In-Memory Execution Main Conference Minh S. Q. Truong Carnegie Mellon University, Yiqiu Sun University of Illinois Urbana-Champaign, Dawei Xiong University of Illinois Urbana-Champaign, Amol Shah University of Illinois Urbana-Champaign, Alex Glass Carnegie Mellon University, Abraham Farrell University of Illinois Urbana-Champaign, James A. Bain Carnegie Mellon University, L. Richard Carley Carnegie Mellon University, Saugata Ghose University of Illinois Urbana-Champaign | ||
16:10 20mTalk | CoCoTree: A Computation-Capable Architecture for Collective Communication in Scalable PIM Main Conference Shunchen Shi Institute of Computing Technology, Chinese Academy of Sciences ; University of Chinese Academy of Sciences, Qijia Yang Institute of Computing Technology, Chinese Academy of Sciences ; University of Chinese Academy of Sciences, Fan Yang Institute of Computing Technology, Chinese Academy of Science, Yu Huang Huazhong University of Science and Technology, Youwei Zhuo Peking University, Zhichun Li Institute of Computing Technology, Chinese Academy of Sciences ; University of Chinese Academy of Sciences, Ninghui Sun State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Xueqi Li State Key Lab of Processors, Institute of Computing Technology, CAS | ||
16:30 20mTalk | PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures Main Conference | ||
16:50 20mTalk | Count2Multiply: Reliable In-Memory High-Radix Counting Main Conference Joao Paulo Cardoso de Lima TU Dresden, ScaDS.AI, Benjamin F. Morris III Duke University, Asif Ali Khan TU Dresden, Germany, Jeronimo Castrillon TU Dresden, Germany, Alex Jones Syracuse University | ||
15:50 - 17:10 | |||
15:50 20mTalk | PADE: A Predictor-Free Sparse Attention Accelerator via Unified Execution and Stage Fusion Main Conference Huizheng Wang Tsinghua University, Hongbin Wang Tsinghua University, Zichuan Wang Tsinghua University, Zhiheng Yue Tsinghua University, Yang Wang Tsinghua University, Chao Li Shanghai Jiao Tong University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University | ||
16:10 20mTalk | AQPIM: Breaking the PIM Capacity Wall for LLMs with In-Memory Activation Quantization Main Conference Kosuke Matsushima Institute of Science Tokyo, Yasuyuki Okoshi Institute of Science Tokyo, Masato Motomura Institute of Science Tokyo, Daichi Fujiki Institute of Science Tokyo | ||
16:30 20mTalk | BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache Main Conference Dayou Du University of Edinburgh, Shijie Cao Microsoft Research, Jianyi Cheng University of Edinburgh, UK, Luo Mai University of Edinburgh, Ting Cao Institute for AI Industry Research (AIR), Tsinghua University, Mao Yang Microsoft Research | ||
16:50 20mTalk | GyRot: Leveraging Hidden Synergy between Rotation and Fine-grained Group Quantization for Low-bit LLM Inference Main Conference | ||
15:50 - 17:10 | |||
15:50 20mTalk | GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering Main Conference Junseo Lee Seoul National University, Sangyun Jeon Seoul National University, Jungi Lee Seoul National University, Junyong Park Seoul National University, Jaewoong Sim Seoul National University | ||
16:10 20mTalk | Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing Main Conference Xiaotong Huang Shanghai Jiao Tong University, He Zhu Shanghai Jiao Tong University, Tianrui Ma Institute of Computing Technology, Chinese Academy of Sciences, Yuxiang Xiong Shanghai Jiao Tong University, Fangxin Liu Shanghai Jiao Tong University, Zhezhi He Shanghai Jiao Tong University, Yiming Gan Institute of Computing Technology, Chinese Academy of Sciences, Zihan Liu Shanghai Jiao Tong University, Jingwen Leng Shanghai Jiao Tong University, Yu Feng Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University | ||
16:30 20mTalk | FractalCloud: A Fractal-Inspired Architecture for Efficient Large-Scale Point Cloud Processing Main Conference Yuzhe Fu Duke University, Changchun Zhou Duke University, Hancheng Ye Duke University, Bowen Duan Duke University, Qiyu Huang Yale University, Chiyue Wei Duke University, Cong Guo Duke University, Hai "Helen" Li Duke University, Yiran Chen Duke University | ||
16:50 20mTalk | ORANGE: Exploring Ockham's Razor for Neural Rendering by Accelerating 3DGS on NPUs with GEMM-Friendly Blending and Balanced Workloads Main Conference Haomin Li Shanghai Jiao Tong University, Yue Liang Shanghai Jiao Tong University, Fangxin Liu Shanghai Jiao Tong University, Bowen Zhu Shanghai Jiao Tong University, Zongwu Wang Shanghai Jiao Tong University, Yu Feng Shanghai Jiao Tong University, Liqiang Lu Zhejiang University, Li Jiang Shanghai Jiao Tong University, Haibing Guan Shanghai Jiao Tong University | ||
17:30 - 19:00 | |||
17:30 90mMeeting | Business Meeting Main Conference | ||
Tue 3 Feb
09:50 - 11:10 | |||
09:50 20mTalk | The Last-Level Branch Predictor Revisited Main Conference David Schall Technical University of Munich, Mária Ďuračková University Of Edinburgh, Boris Grot University of Edinburgh, UK | ||
10:10 20mTalk | Tempranillo: Non-Speculative Early Register Release Main Conference Carlos Escuin Computing Systems Lab, Huawei Technologies Switzerland AG, Paolo Salvatore Galfano Computing Systems Laboratory, Zurich Research Center, Huawei Technologies, Switzerland, Davide Basilio Bartolini Computing Systems Laboratory, Zurich Research Center, Huawei Technologies, Switzerland, Leeor Peled Boole Labs, Tel-Aviv Research Center, Huawei Technologies, Israel, Mehdi Alipour Computing Systems Laboratory, Zurich Research Center, Huawei Technologies, Switzerland | ||
10:30 20mTalk | SMTcheck: Accurate SMT Interference Prediction to Improve Scheduling Efficiency in Datacenters Main Conference Sanghyun Kim Sungkyunkwan University, Jinhyeok Oh Sungkyunkwan University, Taehun Kim Sungkyunkwan University, Gyutae Kim Sungkyunkwan University, Youngsok Kim Yonsei University, Jaehyun Hwang Sungkyunkwan University, Joonsung Kim Sungkyunkwan University | ||
10:50 20mTalk | I-POP: Ignite Positive Prefetchers Main Conference Yiquan Lin Zhejiang University and Alibaba Group, Wenhai Lin Alibaba Group, Yiquan Chen Alibaba Group, Jiexiong Xu Zhejiang University and Alibaba Group, Shishun Cai Alibaba Group, Jiarong Ye Zhejiang University, Zonghui Wang Zhejiang University, Wenzhi Chen Zhejiang University | ||
09:50 - 11:10 | |||
09:50 20mTalk | WATOS: Efficient LLM Training Strategies and Architecture Co-exploration for Wafer-scale Chip Main Conference Huizheng Wang Tsinghua University, Zichuan Wang Tsinghua University, Hongbin Wang Tsinghua University, Jingxiang Hou Tsinghua University, Taiquan Wei Tsinghua University, Chao Li Shanghai Jiao Tong University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University | ||
10:10 20mTalk | FACE: Fully PD Overlapped Scheduling and Multi-Level Architecture Co-Exploration on Wafer Main Conference Zheng Xu Tsinghua University, Dehao Kong Tsinghua University, Jiaxin Liu Tsinghua University, Dingcheng Jiang Tsinghua University, Xu Dai Shanghai Artificial Intelligence Laboratory, Jinyi Deng Tsinghua University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University | ||
10:30 20mTalk | TEMP: A Memory Efficient Physical-aware Tensor Partition-Mapping Framework on Wafer-scale Chips Main Conference Huizheng Wang Tsinghua University, Taiquan Wei Tsinghua University, Zichuan Wang Tsinghua University, Dingcheng Jiang Tsinghua University, Qize Yang Tsinghua University, Jiaxin Liu Tsinghua University, Jingxiang Hou Tsinghua University, Chao Li Shanghai Jiao Tong University, Jinyi Deng Tsinghua University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University | ||
10:50 20mTalk | MoEntwine: Unleashing the Potential of Wafer-scale Chips for Large-scale Expert Parallel Inference Main Conference Xinru Tang Tsinghua University, Jingxiang Hou Tsinghua University, Dingcheng Jiang Tsinghua University, Taiquan Wei Tsinghua University, Jiaxin Liu Tsinghua University, Jinyi Deng Tsinghua University, Huizheng Wang Tsinghua University, Qize Yang Tsinghua University, Haoran Shang Tsinghua University, Chao Li Shanghai Jiao Tong University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University | ||
11:10 - 11:30 | |||
11:10 20mCoffee break | Break HPCA/CGO/PPoPP/CC Catering | ||
11:30 - 12:50 | |||
11:30 20mTalk | Athena: Synergizing Data Prefetching and Off-Chip Prediction via Online Reinforcement Learning Main Conference Zhenrong Lang ETH Zürich, Rahul Bera ETH Zurich, Caroline Hengartner ETH Zürich, Konstantinos Kanellopoulos ETH Zurich, Rakesh Kumar NTNU, Mohammad Sadrosadati ETH Zürich, Onur Mutlu ETH Zurich | ||
11:50 20mTalk | Streamlined On-Chip Temporal Prefetching Main Conference | ||
12:10 20mTalk | Intermittence-Aware Cache Compression Main Conference Gan Fang Purdue University, Jianping Zeng Arizona State University, Yuchen Zhou Purdue University, Changhee Jung Purdue University, USA | ||
12:30 20mTalk | SnakeMan: Applying Relation-centric Notation to Model and Optimize Data Swizzle in the Cache of Modern NPU Main Conference Hanyu Zhang Zhejiang University, Fangxu Guo Zhejiang University, Liqiang Lu Zhejiang University, Long Wang Huawei Technologies, Yunfei Du Huawei Technologies, Zhe Wang Huawei Technologies, Jinghan Zhang Huawei Technologies, Jie Zhang Peking University, Chenli Xue Zhejiang University, Chengpeng Wu Zhejiang University, Ziyi Zhang Zhejiang University, Eric Liang Peking University, Size Zheng ByteDance, Jianwei Yin Zhejiang University | ||
11:30 - 12:50 | |||
11:30 20mTalk | V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval Main Conference | ||
11:50 20mTalk | SFD: Towards Segment Fusion Dataflow for Spatial Accelerators Main Conference Fuyu Wang Sun Yat-sen University, Minghua Shen Sun Yat-sen University, Yufei Ding UCSD, Nong Xiao National University of Defense Technology & Sun Yat-sen University, Yutong Lu Sun Yat-sen University | ||
12:10 20mTalk | VAR-Turbo: Unlocking the Potential of Visual Autoregressive Models through Dual Redundancy Main Conference Xujiang Xiang The Hong Kong University of Science and Technology, Fengbin Tu The Hong Kong University of Science and Technology | ||
12:30 20mTalk | GauPHP: An Accelerator for 3D Gaussian Splatting Training with Gaussian-Pixel Hybrid Parallelism Main Conference Rui Wen Institute of Computing Technology, Chinese Academy of Sciences, Zhifei Yue University of Science and Technology of China, Tianbo Liu University of Science and Technology of China, Xinkai Song Institute of Computing Technology, Chinese Academy of Sciences, Jin Li Institute of Computing Technology, Chinese Academy of Sciences, Di Huang Chinese Academy of Sciences, Institute of Computing Technology, Jiaming Guo Institute of Computing Technology, Chinese Academy of Sciences, Xing Hu Institute of Computing Technology, Chinese Academy of Sciences, zidong du Institute of Computing Technology, Chinese Academy of Sciences, Qi Guo Chinese Academy of Sciences, Tianshi Chen Cambricon Technologies | ||
11:30 - 12:50 | |||
11:30 20mTalk | zkPHIRE: A Programmable Accelerator for ZKPs over HIgh-degRee, Expressive Gates Main Conference Alhad Daftardar New York University, Jianqiao Cambridge Mo New York University, Joey Ah-kiow New York University, Benedikt Bünz New York University, Siddharth Garg New York University, Brandon Reagen New York University | ||
11:50 20mTalk | Conflux: A High-Performance Keyword Private Retrieval System for Dynamic Datasets Main Conference Zehao Chen Shandong University, Zhaoyan Shen Shandong University, Qian Wei Shandong University, Hang Lu Institute of Computing Technology, Chinese Academy of Sciences, Lei Ju Shandong University | ||
12:10 20mTalk | An Efficient and Scalable Hardware Architecture for Number Theoretic Transform on FPGA with Design Automation Main Conference Yilan Zhu Ant Group, Geng Yang Ant Group, Xingyu Tian Simon Fraser University, Dilshan Kumarathunga Simon Fraser University, Liang Kong Ant Group, Xianglong Deng UCAS, Shengyu Fan UCAS, Guang Fan Ant Group, Guiming Shi Tsinghua University, Lei Chen University of Chinese Academy of Sciences, Bo Zhang Ant Group, Yisong Chang Ant Group, Shoumeng Yan Ant Group, Zhenman Fang Simon Fraser University, Mingzhe Zhang Ant Group | ||
12:30 20mTalk | IVE: An Accelerator for Single-Server Private Information Retrieval Using a Versatile Processing Element Main Conference Sangpyo Kim Seoul National University, Hyesung Ji Seoul National University, Jongmin Kim Seoul National University, Jaiyoung Park Seoul National University, Wonseok Choi Seoul National University, Jung Ho Ahn Seoul National University Pre-print | ||
12:50 - 14:10 | |||
12:50 80mLunch | Lunch HPCA/CGO/PPoPP/CC Catering | ||
14:10 - 15:30 | |||
14:10 20mTalk | BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism Main Conference | ||
14:30 20mTalk | RoMe: Row Granularity Access Memory System for Large Language Models Main Conference Hwayong Nam Seoul National University, Michael Jaemin Kim Meta, Seungmin Baek Seoul National University, Jumin Kim Seoul National University, Jung Ho Ahn Seoul National University Pre-print | ||
14:50 20mTalk | HDPAT: Hierarchical Distributed Page Address Translation for Wafer-Scale GPUs Main Conference daoxuan xu William & Mary, Ying Li William & Mary, Yuwei Sun UIUC, Jie Ren William & Mary, Yifan Sun William & Mary | ||
15:10 20mTalk | Pulse: Fine-Grained Hierarchical Hashing Index for Disaggregated Memory Main Conference Guangyang Deng Xiamen University, Zixiang Yu Xiamen University, Zhirong Shen Xiamen University, Qiangsheng Su Xiamen University, Jiwu Shu Xiamen University | ||
14:10 - 15:30 | |||
14:10 20mTalk | LILo: Harnessing the On-chip Accelerators in Intel CPUs for Compressed LLM Inference Acceleration Main Conference Hyungyo Kim UIUC, Qirong Xia UIUC, Jinghan Huang UIUC, Nachuan Wang UIUC, Jung Ho Ahn Seoul National University, Younjoo Lee Seoul National University, Wajdi K Feghali Intel, Ren Wang Intel Labs, Nam Sung Kim UIUC | ||
14:30 20mTalk | ReThermal: Co-Design of Thermal-Aware Static and Dynamic Scheduling for LLM Training on Liquid-Cooled Wafer-Scale Chips Main Conference Chengran Li Tsinghua University, Huizheng Wang Tsinghua University, Jiaxin Liu Tsinghua University, Jingyao Liu Tsinghua University, Zhiheng Yue Tsinghua University, Xia Li Shanghai AI Lab, Shenfei Jiang Shanghai AI Lab, Jinyi Deng Tsinghua University, Yang Hu Tsinghua University, Shouyi Yin Tsinghua University | ||
14:50 20mTalk | TraceRTL: Agile Performance Evaluation for Microarchitecture Exploration Main Conference Zifei Zhang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Yinan Xu SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Sa Wang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Dan Tang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; Beijing Institute of Open Source Chip, Yungang Bao State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences | ||
15:10 20mTalk | Nugget: Portable Program Snippets Main Conference Zhantong Qiu University of California, Davis, Mahyar Samani University of California, Davis, Jason Lowe-Power University of California, Davis & Google | ||
14:10 - 15:30 | |||
14:10 20mTalk | BASES: Enabling Energy-Efficient and Error-Resilient Analog CIM Acceleration via Reformation of Coding Bases Main Conference Hongrui Guo Institute of Computing Technology, Chinese Academy of Sciences, Tianrui Ma Institute of Computing Technology, Chinese Academy of Sciences, Zidong Du Institute of Computing Technology, Chinese Academy of Sciences, Mo Zou Institute of Computing Technology, Chinese Academy of Sciences, Yifan Hao ICT, Chinese Academy of Sciences, Yongwei Zhao Institute of Computing Technology, Chinese Academy of Sciences, Rui Zhang Chinese Academy of Sciences, Wei Li Institute of Software Chinese Academy of Sciences; University of Chinese Academy of Sciences, Xing Hu Institute of Computing Technology, Chinese Academy of Sciences, Zhiwei Xu Institute of Computing Technology of the Chinese Academy of Sciences, China, Qi Guo Chinese Academy of Sciences, Tianshi Chen Cambricon Technologies | ||
14:30 20mTalk | A PN-Free Digital SAT Accelerator Using Crossbar Architecture and Frequency-Controlled Counters Main Conference Zhezheng Ren University of Waterloo, Chenao Yuan University of Waterloo, Yuke Zhang University of Toronto, Shiyu Su University of Waterloo | ||
14:50 20mTalk | ESTroM: Element-Flow Architecture For Processing Sparse Tractable Probabilistic Models Main Conference Anjunyi Fan Peking University, Xuejie Liu Peking University, Anji Liu University of California, Los Angeles, Qiuping Wu Peking University, Jiaqi Yang Peking University, Yuchao Qin Peking University, Guy Van den Broeck University of California at Los Angeles, Yitao Liang Peking University, Bonan Yan Peking University | ||
15:10 20mTalk | GustavSNN: Unleashing the Power of Gustavson's Algorithm on SNN Acceleration with Column-Parallel Tick-Batch Dataflow Main Conference Sangwoo Hwang Korea University, Donghun Lee Korea University, Jahyun Koo DGIST, Jaeha Kung Korea University | ||
15:30 - 15:50 | |||
15:30 20mCoffee break | Break HPCA/CGO/PPoPP/CC Catering | ||
15:50 - 17:10 | |||
15:50 20mTalk | NPUWattch: ML-based Power, Area, and Timing Modeling for Neural Accelerators Main Conference Sehyeon Kim Yonsei University, Minkwan Kim Yonsei University, Chanho Park Yonsei University, Hanmok Park Kyungpook National University, Seonghoon Kim Kyungpook National University, Taigon Song Kyungpook National University, William Song Yonsei University | ||
16:10 20mTalk | Area Bloating and the Future of Specialization Main Conference | ||
16:30 20mTalk | Advancing Full-stack Acceleration for Schrödinger-Style Quantum Simulation Main Conference Shuang Liang Imperial College London, Yuncheng Lu Imperial College London, Ce Guo Imperial College London, Paul H J Kelly Imperial College London, Wayne Luk Imperial College London, Hongxiang Fan Imperial College London | ||
16:50 20mTalk | COMET: Communication and Memory Co-Design for Fine-Grained AI Inference in MCM Accelerators Main Conference Taishu Sheng College of Computer Science and Technology, National University of Defense Technology, Guangyu Sun Peking University, Dezun Dong NUDT | ||
15:50 - 17:10 | |||
15:50 20mTalk | Compression-Aware Gradient Splitting for Collective Communications in Distributed Training Main Conference Pranati Majhi Texas A&M University, Sabuj Laskar Texas A&M University, Abdullah Muzahid Texas A&M University, Eun Jung Kim | ||
16:10 20mTalk | SCALE: Tackling Communication Bottlenecks in Confidential Multi-GPU ML Main Conference Joongun Park Georgia Tech, Yongqin Wang University of Southern California, Huan Xu Georgia Institute of Technology, Hanjiang Wu Georgia Institute of Technology, Mengyuan Li USC, Tushar Krishna Georgia Institute of Technology | ||
16:30 20mTalk | AutoHAAP: Automated Heterogeneity-Aware Asymmetric Partitioning for LLM Training Main Conference Yuanyuan Wang Zhejiang Lab, Nana Tang Zhejiang Lab, Yuyang Wang Zhejiang Lab, Shu Pan Zhejiang Lab, Dingding Yu Zhejiang Lab, Zeyue Wang Zhejiang Lab, Mou Sun Zhejiang Lab, Kejie Fu Zhejiang Lab, Fangyu Wang Zhejiang Lab, Yunchuan Chen Zhejiang Lab, Ning Sun Zhejiang Lab, Fei Yang Zhejiang Lab | ||
16:50 20mTalk | Towards Compute-Aware In-Switch Computing for LLMs Tensor-Parallelism on Multi-GPU Systems Main Conference Chen Zhang Shanghai Jiao Tong University, Qijun Zhang Shanghai Jiao Tong University, Zhuoshan Zhou Shanghai Jiao Tong University, Yijia Diao Shanghai Jiao Tong University, Haibo Wang Huawei, Zhe Zhou Huawei, Zhipeng Tu Huawei, Zhiyao Li Huawei, Guangyu Sun Peking University, Zhuoran Song Shanghai Jiao Tong University, Zhigang Ji Shanghai Jiao Tong University, Jingwen Leng Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University | ||
15:50 - 17:10 | |||
15:50 20mTalk | Uni-STC: Unified Sparse Tensor Core Main Conference Haocheng Lian China University of Petroleum-Beijing, Qiyue Zhang China University of Petroleum-Beijing, Xinran Zhao China University of Petroleum-Beijing, Meichen Dong China University of Petroleum-Beijing, Yijie Nie China University of Petroleum-Beijing, Zhengyi Zhao China University of Petroleum-Beijing, Junzhong Shen National University of Defense Technology, Wei Guo National University of Defense Technology, Chun Huang National University of Defense Technology, Bingcai Sui National University of Defense Technology, Weifeng Liu China University of Petroleum-Beijing | ||
16:10 20mTalk | AUM: Unleashing the Efficiency Potential of Shared Processors with Accelerator Units for LLM Serving Main Conference Xinkai Wang Shanghai Jiao Tong University, Chao Li Shanghai Jiao Tong University, Yiming Zhuansun Shanghai Jiao Tong University, Jinyang Guo Shanghai Jiao Tong University, Xiaofeng Hou Shanghai Jiao Tong University, Jing Wang Shanghai Jiao Tong University, Luping Wang Alibaba Group, Weigao Chen Alibaba Group, Cheng Huang Alibaba Group, Guodong Yang Alibaba Group, Liping Zhang Alibaba Group, Minyi Guo Shanghai Jiao Tong University | ||
16:30 20mTalk | DRACO: A Hardware-Efficient Robot Rigid Body Dynamics Accelerator with Precision-Aware Quantization Framework Main Conference Xingyu Liu The Hong Kong University of Science and Technology, Jiawei Liang The Hong Kong University of Science and Technology, Yipu Zhang The Hong Kong University of Science and Technology, Linfeng Du The Hong Kong University of Science and Technology, Chaofang Ma The Hong Kong University of Science and Technology, Hui Yu Hong Kong University of Science and Technology, Xu Jiang University of Electronic Science and Technology of China, Wei Zhang The Hong Kong University of Science and Technology | ||
16:50 20mTalk | REASON: Accelerating Probabilistic Logical Reasoning for Neuro-Symbolic Cognitive Intelligence Main Conference Zishen Wan Georgia Institute of Technology, Che-Kai Liu Georgia Institute of Technology, Jiayi Qian Georgia Institute of Technology, Hanchen Yang Georgia Institute of Technology, Arijit Raychowdhury Georgia Institute of Technology, Tushar Krishna Georgia Institute of Technology | ||
17:15 - 18:15 | |||
17:15 20mTalk | GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping Main Conference Julien Eudine Huawei Technologies Switzerland AG, Chu Li Huawei Zurich Research Center, Zhuo Cheng Huawei Zurich Research Center, Renzo Andri Huawei Technologies Switzerland AG, Onur Mutlu ETH Zurich, Can Firtina ETH Zurich and UMD, Mohammad Sadrosadati ETH Zürich, Nika Mansouri Ghiasi ETH Zurich, Konstantina Koliogeorgi ETH Zurich, Anirban Nag Huawei Zurich Research Center, Arash Tavakkol Huawei Zurich Research Center, Haiyu Mao King's College London, Shai Bergman Huawei Zurich Research Center, Ji Zhang Huawei Zurich Research Center | ||
17:35 20mTalk | SAGe: A Lightweight Algorithm-Architecture Co-Design for Mitigating the Data Preparation Bottleneck in Large-Scale Genome Sequence Analysis Main Conference Nika Mansouri Ghiasi ETH Zurich, Talu Güloglu ETH Zurich, Harun Mustafa ETH Zurich and Johns Hopkins University, Can Firtina ETH Zurich and UMD, Konstantina Koliogeorgi ETH Zurich, Konstantinos Kanellopoulos ETH Zurich, Haiyu Mao King's College London, Rakesh Nadig ETH Zurich, Mohammad Sadrosadati ETH Zürich, Jisung Park POSTECH (Pohang University of Science and Technology), Onur Mutlu ETH Zurich | ||
17:55 20mTalk | NP-CAM: Efficient and Scalable DNA Classification using a NoC-Partitioned CAM Architecture Main Conference Benjamin F. Morris III Duke University, Tergel Molom-Ochir Duke University, Changchun Zhou Duke University, Yiran Chen Duke University, Alex Jones Syracuse University, Hai "Helen" Li Duke University | ||
18:30 - 21:30 | |||
18:30 3hSocial Event | Excursion HPCA/CGO/PPoPP/CC Catering | ||
Wed 4 Feb (displayed time zone: Hobart)
09:50 - 11:10 | |||
09:50 20mTalk | DSASSASSIN: Cross-VM Side-Channel Attacks by Exploiting Intel Data Streaming Accelerator Main Conference Ben Chen The Hong Kong University of Science and Technology (Guangzhou), Kunlin Li The Hong Kong University of Science and Technology (Guangzhou), Shuwen Deng Tsinghua University, Dongsheng Wang Tsinghua University, Yun Chen The Hong Kong University of Science and Technology (Guangzhou) | ||
10:10 20mTalk | SSBleed: Non-speculative Side-channel Attacks via Speculative Store Bypass on Armv9 CPUs Main Conference Chang Liu Tsinghua University, Hongpei Zheng Tsinghua University, Xin Zhang Peking University, Dapeng Ju Tsinghua University, Dongsheng Wang Tsinghua University, Yinqian Zhang Southern University of Science and Technology, Trevor E. Carlson National University of Singapore | ||
10:30 20mTalk | Protean: A Programmable Spectre Defense Main Conference Nicholas Mosier Stanford University, Hamed Nemati KTH Royal Institute of Technology, John C. Mitchell Stanford University, Caroline Trippel Stanford University | ||
10:50 20mTalk | HERO-Sign: Hierarchical Tuning and Efficient Compiler-Time GPU Optimizations for SPHINCS+ Signature Generation Main Conference Yaoyun Zhou University of California, Merced, Qian Wang University of California, Merced (UC Merced) | ||
09:50 - 11:10 | |||
09:50 20mTalk | VeloxGNN: Accelerating Out-of-Core based GNN Training with Low Data Migration and High Accuracy via Delayed Gradient Propagation Main Conference Yi Li University of Texas at Dallas, Tsun-Yu Yang Center for Computational Evolutionary Intelligence, Electrical & Computer Engineering, Duke University, Zhaoyan Shen Shandong University, Ming-Chang Yang The Chinese University of Hong Kong (CUHK), Bingzhe Li University of Texas at Dallas | ||
10:10 20mTalk | AutoGNN: End-to-End Hardware-Driven Graph Preprocessing for Enhanced GNN Performance Main Conference Seungkwan Kang KAIST, Seungjun Lee KAIST, Donghyun Gouk Panmnesia, Miryeong Kwon Panmnesia, Hyunkyu Choi Panmnesia, Junhyeok Jang Panmnesia, Sangwon Lee Panmnesia, Huiwon Choi KAIST, Jie Zhang Peking University, Wonil Choi Hanyang University, Mahmut Taylan Kandemir Pennsylvania State University, Myoungsoo Jung KAIST | ||
10:30 20mTalk | Scaling Graph Neural Network Training via Geometric Optimization Main Conference Fangzhou Ye University of Central Florida, Lingxiang Yin University of Central Florida, Hao Zheng University of Central Florida | ||
10:50 20mTalk | VectorLiteRAG: Latency-Aware and Fine-Grained Resource Partitioning for Efficient RAG Main Conference | ||
09:50 - 11:10 | |||
09:50 20mTalk | μShare: Non-Intrusive Kernel Co-Locating on NVIDIA GPUs Main Conference Wenhao Huang Tianjin University, Zhaolin Duan Tianjin University, Laiping Zhao Tianjin University, Yuhao Zhang Tianjin University, Yanjie Wang Tianjin University, Yiming Li Tianjin University, Yihan Wang Tianjin University, Yichi Chen Tianjin University, Zhihang Tang Tianjin University, Kang Chen Tsinghua University, Deze Zeng China University of Geosciences, Wenxin Li Tianjin University, Keqiu Li Tianjin University | ||
10:10 20mTalk | FlashFuser: Expanding the Scale of Kernel Fusion for Compute-Intensive Operators via Inter-Core Connection Main Conference Huang Ziyu Shanghai Jiao Tong University, Yangjie Zhou National University of Singapore, Zihan Liu Shanghai Jiao Tong University, Xinhao Luo Shanghai Jiao Tong University, Yijia Diao Shanghai Jiao Tong University, Minyi Guo Shanghai Jiao Tong University, Jidong Zhai Tsinghua University, Yu Feng Shanghai Jiao Tong University, Chen Zhang Shanghai Jiao Tong University, Anbang Wu Shanghai Jiao Tong University, Jingwen Leng Shanghai Jiao Tong University | ||
10:30 20mTalk | Swift: High-Performance Sparse-Dense Matrix Multiplication on GPUs Main Conference Jinyu Hu Hunan University, Huizhang Luo Hunan University, Hong Jiang UT Arlington, Marc Casas Barcelona Supercomputing Center, Kenli Li National Supercomputing Center in Changsha, Hunan University, Chubo Liu Hunan University | ||
10:50 20mTalk | QuCo: Efficient and Flexible Hardware-Driven Automatic Configuration of Tile Transfers in GPUs Main Conference Nicolas Meseguer University of Murcia, Daoxuan Xu William & Mary, Yifan Sun William & Mary, Michael Pellauer Nvidia, José L. Abellán University of Murcia, Manuel E. Acacio Universidad de Murcia (UMU) | ||
11:10 - 11:30 | |||
11:10 20mCoffee break | Break HPCA/CGO/PPoPP/CC Catering | ||
11:30 - 12:50 | |||
11:30 20mTalk | RidgeWalker: Perfectly Pipelined Graph Random Walks on FPGAs Main Conference Hongshi Tan National University of Singapore, Yao Chen, Xinyu Chen Hong Kong University of Science and Technology, Qizhen Zhang University of Toronto, Cheng Chen ByteDance, China, Weng-Fai Wong National University of Singapore, Bingsheng He National University of Singapore | ||
11:50 20mTalk | DP-HLS: A High-Level Synthesis Framework for Accelerating Dynamic Programming Algorithms in Bioinformatics Main Conference Anshu Gupta UC San Diego, Yingqi Cao UC San Diego, Jason Liang UC San Diego, Yatish Turakhia UC San Diego | ||
12:10 20mTalk | Sassy: SmartNIC-Assisted Notification Delivery for μs-scale RDMA Workloads Main Conference | ||
12:30 20mTalk | TurboFuzz: FPGA Accelerated Hardware Fuzzing for Processor Agile Verification Main Conference Yang Zhong Institute of Computing, Chinese Academy of Sciences, Haoran Wu University of Cambridge, Xueqi Li State Key Lab of Processors, Institute of Computing Technology, CAS, Sa Wang SKLP, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, David Boland The University of Sydney, Yungang Bao State Key Lab of Processors, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences, Kan Shi Institute of Computing, Chinese Academy of Sciences | ||
11:30 - 12:50 | |||
11:30 20mTalk | Near-Zero-Overhead Freshness for Recommendation Systems via Inference-Side Model Updates Main Conference Wenjun Yu Hong Kong Baptist University, Sitian Chen Hong Kong Baptist University, Amelie Chi Zhou Hong Kong Baptist University, Cheng Chen ByteDance, China | ||
11:50 20mTalk | AccelFlow: Orchestrating an On-Package Ensemble of Fine-Grained Accelerators for Microservices Main Conference Jovan Stojkovic University of Illinois at Urbana-Champaign, Abraham Farrell University of Illinois Urbana-Champaign, Zhangxiaowen Gong Intel, Christopher J. Hughes Intel, Josep Torrellas University of Illinois at Urbana-Champaign | ||
12:10 20mTalk | SpotCC: Facilitating Coded Computation for Prediction Serving Systems on Spot Instances Main Conference Lin Wang, Yuchong Hu Huazhong University of Science and Technology, Ziling Duan Huazhong University of Science and Technology, Mingqi Li Huazhong University of Science and Technology, Chenxuan Yao Huazhong University of Science and Technology, feifanliu Huazhong University of Science and Technology, Xiaolu Li Huazhong University of Science and Technology, Leihua Qin Huazhong University of Science and Technology, Dan Feng Huazhong University of Science and Technology, China | ||
12:30 20mTalk | LowCarb: Carbon-Aware Scheduling of Serverless Functions Main Conference | ||
11:30 - 12:50 | |||
11:30 20mTalk | Exploration of LLM Workload Reliability based on di/dt effects and Voltage Droops Main Conference Zhixing Jiang University of Texas at Austin, Justin Garrigus University of Texas at Austin, Allison Seigler University of Texas at Austin, Ethan Syed University of Texas at Austin, Yan-Lun Huang University of Texas at Austin, Mehdi Sadi Advanced Micro Devices, Tawfik Rahal-Arabi Advanced Micro Devices, Lizy John University of Texas, Austin | ||
11:50 20mTalk | ARIADNE: Adaptive UVM Management for Efficient GPU Memory Oversubscription Main Conference Hyunkyun Shin Yonsei University, Seongtae Bang DGIST, Hyungwon Park DGIST, Daehoon Kim Yonsei University | ||
12:10 20mTalk | LRM-GPU: Alleviating Synchronization Overhead for Multi-Chiplet GPU Architecture Main Conference Baiqing Zhong Sun Yat-Sen University, Zhirong Ye Sun Yat-Sen University, Xiaojie Li Sun Yat-Sen University, Peilin Wang Sun Yat-Sen University, Haiqiu Huang Sun Yat-Sen University, Zhaolin Li Tsinghua University, Zhiyi Yu Sun Yat-sen University, Mingyu Wang Sun Yat-Sen University | ||
12:30 20mTalk | LEGO: Supporting LLM-enhanced Games with One Gaming GPU Main Conference Han Zhao Shanghai Jiao Tong University, Weihao Cui Shanghai Jiao Tong University, Zeshen Zhang Tongji University, Wenhao Zhang Shanghai Jiao Tong University, Jiangtong Li Tongji University, Quan Chen Shanghai Jiao Tong University, China, Youmin Chen Shanghai Jiao Tong University, Pu Pang Shanghai Jiao Tong University, Zijun Li Shanghai Jiao Tong University, Zhenhua Han The University of Hong Kong, Yuqing Yang Microsoft Research, Minyi Guo Shanghai Jiao Tong University | ||
Open Conference Statements
IEEE Computer Society Open Conference Statements
Expanding participation in computing is central to the goals of the IEEE Computer Society and all of its conferences. The IEEE Computer Society is firmly committed to broad participation in all sponsored activities, including but not limited to, technical communities, steering committees, conference organizations, standards committees, and ad hoc committees that welcome the entire global community.
IEEE's mission to foster technological innovation and excellence to benefit humanity requires the talents and perspectives of people with many disciplinary backgrounds.
All individuals are entitled to participate in any IEEE Computer Society activity free of discrimination and harassment.