Unlocking the effects of on-package memory on scientific kernels in high-performance computing
High-bandwidth memory can improve a computer's performance, and on-package memory (OPM) is a popular option in many commercial systems. Before this effort, little was known about how OPM affects speed and power use. The team experimentally characterized and analyzed modern OPM, then provided guidelines on tuning the memory to speed up high-performance computing (HPC) applications.
This study of OPMs lays groundwork for advancing computing systems. It motivates software-architecture co-design exploration, supports the validation of models and simulations, and has resulted in general optimization guidelines. The work shows how to tune applications and architectures for the best performance on platforms with certain OPMs.
The researchers conducted a thorough experimental evaluation to discern how modern OPMs affect the performance and power efficiency of important HPC scientific kernels, the core computational routines on which scientific applications spend most of their run time. They examined the different tuning modes of OPM and how those modes influence application tuning for the best system performance. The team, from Pacific Northwest National Laboratory, University of Copenhagen, and Virginia Tech, evaluated diverse HPC kernels on two Intel OPMs: eDRAM on the multicore Broadwell and MCDRAM on the manycore Knights Landing. Each kernel was run on a large set of representative input matrices (for example, 968 matrices for sparse kernels). From these measurements the team derived an intuitive visual analytical model that better explains complex architectural scenarios, along with general guidelines for future architecture optimizations and efficiency tuning.
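To make the term "sparse kernel" concrete, here is a minimal sketch of one common example, sparse matrix-vector multiplication on a matrix in compressed sparse row (CSR) form. It is an illustration only: the text above does not say which kernels the team ran, and the function names, the byte sizes, and the assumption that every array is streamed from memory exactly once are all hypothetical choices made for this sketch.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Nonzeros of this row occupy positions row_ptr[row] .. row_ptr[row + 1] - 1.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def flops_per_byte(nnz, n_rows, value_bytes=8, index_bytes=4):
    """Rough arithmetic intensity of CSR SpMV on a square matrix.

    Assumption for illustration: values, col_idx, row_ptr, and x are each
    read once and y is written once, with no cache reuse modeled.
    """
    flops = 2 * nnz  # one multiply and one add per stored nonzero
    bytes_moved = (
        nnz * (value_bytes + index_bytes)  # values and col_idx
        + (n_rows + 1) * index_bytes       # row_ptr
        + 2 * n_rows * value_bytes         # read x, write y
    )
    return flops / bytes_moved


# Hypothetical 3x3 example: A = [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
intensity = flops_per_byte(nnz=len(values), n_rows=len(x))
```

Under these assumptions the kernel performs well under one floating-point operation per byte moved, so its run time tends to depend on how quickly memory can deliver data rather than on arithmetic throughput. That is the kind of kernel for which a high-bandwidth memory level such as OPM is most likely to matter.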