PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures
The ability to dynamically allocate memory is fundamental in modern programming languages. However, this feature is not adequately supported in current general-purpose PIM devices. To identify key design principles that PIM must consider, we conduct a design space exploration of PIM memory allocators, examining various strategies for metadata placement and management of the allocator. Based on this exploration, we introduce PIM-malloc, a fast and scalable memory allocator for general-purpose PIM that operates on real PIM hardware, achieving a $66\times$ improvement in memory allocation performance. This design is further enhanced with a lightweight, per-PIM core hardware cache, specifically designed for dynamic memory allocation, achieving an additional $31%$ performance improvement. Finally, we demonstrate the applicability of PIM-malloc by developing several representative PIM workloads, demonstrating its effectiveness in enhancing programmability.
Mon 2 FebDisplayed time zone: Hobart change
15:50 - 17:10 | Processing-in-Memory ArchitecturesMain Conference at Collaroy Chair(s): Byeongho Kim Samsung Electronics | ||
15:50 20mTalk | The Memory Processing Unit: A Generalized Interface for End-to-End In-Memory Execution Main Conference Minh S. Q. Truong Carnegie Mellon University, Yiqiu Sun University of Illinois Urbana-Champaign, Dawei Xiong University of Illinois Urbana-Champaign, Amol Shah University of Illinois Urbana-Champaign, Alex Glass Carnegie Mellon University, Abraham Farrell University of Illinois Urbana-Champaign, James A. Bain Carnegie Mellon University, L. Richard Carley Carnegie Mellon University, Saugata Ghose University of Illinois Urbana-Champaign Link to publication | ||
16:10 20mTalk | CoCoTree: A Computation-Capable Architecture for Collective Communication in Scalable PIM Main Conference Shunchen Shi Institute of Computing Technology, Chinese Academy of Sciences ; University of Chinese Academy of Sciences, Qijia Yang Institute of Computing Technology, Chinese Academy of Sciences ; University of Chinese Academy of Sciences, Fan Yang Institute of Computing Technology, Chinese Academy of Science, Yu Huang Huazhong University of Science and Technology, Youwei Zhuo Peking University, Zhichun Li Institute of Computing Technology, Chinese Academy of Sciences ; University of Chinese Academy of Sciences, Ninghui Sun State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Xueqi Li State Key Lab of Processors, Institute of Computing Technology, CAS | ||
16:30 20mTalk | PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures Main Conference | ||
16:50 20mTalk | Count2Multiply: Reliable In-Memory High-Radix Counting Main Conference Joao Paulo Cardoso de Lima TU Dresden, ScaDS.AI, Benjamin F. Morris III Duke University, Asif Ali Khan TU Dresden, Germany, Jeronimo Castrillon TU Dresden, Germany, Alex Jones Syracuse University | ||