NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering

2024 International Symposium on Microarchitecture |

The Compute Express Link (CXL) interconnect makes it feasible to integrate diverse types of memory into servers via its byte-addressable SerDes links. Considering the various access latency, harnessing the full potential of CXL-based heterogeneous memory systems requires efficient memory tiering. However, prior work can hardly make a fundamental progress owing to low-resolution and high-overhead memory access profiling techniques. To address this critical challenge, we propose a novel memory tiering solution called NeoMem, which features a hardware/software co-design. NeoMem offloads memory profiling functions to CXL device-side controllers, integrating a dedicated hardware unit called NeoProf. NeoProf readily monitors memory accesses and provides the OS with crucial page hotness statistics and other useful system state information. On the OS kernel side, we design a revamped memory-tiering strategy, enabling accurate and timely hot page promotion based on NeoProf statistics. We implement NeoMem on a real FPGA-based CXL memory platform and Linux kernel v6.3. Comprehensive evaluations demonstrate that NeoMem achieves 32% to 67% geomean speedup over several existing memory tiering solutions.