ReScue: Reliable and Secure CXL Memory
As CXL decouples the host CPU from a specific memory interface, it allows hyperscalers to recycle a massive number of past-generation DDR DRAM modules from their retired server fleets and cost-effectively expand the memory capacity and bandwidth of their servers with these modules, even when the host CPUs do not natively support them. Nonetheless, using such recycled DDR DRAM modules poses reliability and security challenges. First, these modules may have many uncorrectable/unrepairable faulty words due to years of strenuous use in harsh hyperscale environments. Second, they are more vulnerable to the latest attacks, such as Row Hammer (RH), than current-generation DDR DRAM modules because their defense solutions were designed for previously known attacks. To address these challenges, we propose ReScue, which exploits unique properties of CXL (support for variable-latency out-of-order responses for memory accesses and near-memory processing) and is implemented on an Intel Agilex platform for full-system evaluations. First, ReScue-R stores the remapping addresses to fault-free blocks in faulty 64-byte blocks. If it accesses a faulty block, it retrieves the remapping address, accesses the fault-free block, and returns it to the host CPU without blocking subsequent memory accesses, achieving fault tolerance 9.6 × 10^4 times higher than standard SECDED ECC with only 0.2% performance degradation. Second, after demonstrating that CXL also exposes the known security vulnerabilities of DDR DRAM, ReScue-S shapes the latency of memory accesses to prevent timing-based side-channel attacks, such as reconstructing physical-address-to-DRAM-address mapping functions for RH attacks, with 1.1% performance degradation. Lastly, during the evaluation of ReScue, we observe that the long latency of accessing memory within a CXL memory module can limit the memory access bandwidth and even cause system hangs. After uncovering that these problems are caused by underlying limitations of AXI (widely used to connect CXL and memory controller IPs) when interfaced with CXL IPs, we propose a solution to prevent the system hangs.
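The ReScue-R remapping idea described in the abstract can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the authors' implementation: the class name `RemappingMemory`, the explicit `faulty` set (a stand-in for however the device actually detects a faulty block), the 8-byte little-endian pointer layout, and the `accesses` counter are all hypothetical. It models only the redirection step and does not model the non-blocking, out-of-order CXL responses or how a pointer is protected inside a faulty block.

```python
BLOCK_SIZE = 64  # bytes per block, matching the 64-byte blocks in the abstract


class RemappingMemory:
    """Toy model of a memory device that redirects faulty blocks to spares.

    Assumption: the set of faulty blocks is known up front. A real device
    would need its own detection mechanism, which is not modeled here.
    """

    def __init__(self, num_blocks, faulty_blocks, spare_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.faulty = set(faulty_blocks)
        spares = list(spare_blocks)
        if self.faulty & set(spares):
            raise ValueError("spare blocks must be fault-free")
        if len(spares) < len(self.faulty):
            raise ValueError("not enough spare blocks")
        # Store each remapping address inside the faulty block itself
        # (hypothetical layout: 8-byte little-endian index, zero padded).
        for bad, spare in zip(sorted(self.faulty), spares):
            pointer = spare.to_bytes(8, "little")
            self.blocks[bad] = pointer + bytes(BLOCK_SIZE - 8)
        self.accesses = 0  # counts device-side block accesses

    def _resolve(self, index):
        """Return the block index that actually holds the data."""
        if index in self.faulty:
            self.accesses += 1  # extra access to fetch the remapping address
            return int.from_bytes(self.blocks[index][:8], "little")
        return index

    def write(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        target = self._resolve(index)
        self.accesses += 1
        self.blocks[target] = bytes(data)

    def read(self, index):
        target = self._resolve(index)
        self.accesses += 1
        return self.blocks[target]
```

In this sketch, a read of a faulty block costs two device-side accesses (one for the remapping address, one for the fault-free block), while a read of a healthy block costs one. That extra, variable latency is the kind of behavior the abstract says CXL's variable-latency out-of-order responses can absorb without blocking later requests.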
Mon 2 Feb (displayed time zone: Hobart)
11:30 - 12:50 | DRAM Security and Reliability | Main Conference at Collaroy | Chair(s): Saugata Ghose (University of Illinois Urbana-Champaign)
11:30 | 20m | Talk | MIRZA: Efficiently Mitigating Rowhammer with Randomization and ALERT | Main Conference | Hritvik Taneja (Georgia Tech), Ali Hajiabadi (ETH Zürich), Michele Marazzi (ABB Research), Kaveh Razavi (ETH Zürich), Moinuddin K. Qureshi (Georgia Tech)
11:50 | 20m | Talk | SALT: Track-and-Mitigate Subarrays, Not Rows, for Blast-Radius-Free Rowhammer Defense | Main Conference | Moinuddin K. Qureshi (Georgia Tech)
12:10 | 20m | Talk | ReScue: Reliable and Secure CXL Memory | Main Conference | Chihun Song (UIUC), Austin Antony Cruz (UIUC), Michael Jaemin Kim (Meta), Minbok Wi (Seoul National University), Gaohan Ye (UIUC), Kyungsan Kim (Samsung Electronics), Sangyeol Lee (Samsung Electronics), Jung Ho Ahn (Seoul National University), Nam Sung Kim (UIUC)
12:30 | 20m | Talk | Secret Caching Sauce for High-Performance Secure Memory | Main Conference | Xu Jiang (Huazhong University of Science and Technology), Xueliang Wei (Huazhong University of Science and Technology), YiFei Qu (Huazhong University of Science and Technology), Dan Feng (Huazhong University of Science and Technology, China), Yulai Xie (Huazhong University of Science and Technology), Wei Tong (Huazhong University of Science and Technology, China)