RIKEN Center for Computational Science, 7-1-26 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo, Japan, 650-0047.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. KTH Royal Institute of Technology, Brinellvägen 8, Stockholm, Sweden, 114 28.
Indian Institute of Technology, Roorkee - Haridwar Highway, Roorkee, Uttarakhand, India.
Chalmers University of Technology, Chalmersplatsen 4, Göteborg, Västra Götaland, Sweden, 412 96.
Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo.
National Institute of Advanced Industrial Science and Technology, 1-8-31 Midorigaoka, Ikeda-ku, Osaka, Japan, 563-0026.
RIKEN Center for Computational Science, 7-1-26 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo, Japan, 650-0047.
RIKEN Center for Computational Science, 7-1-26 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo, Japan, 650-0047.
2023 (English). In: ACM Transactions on Architecture and Code Optimization (TACO), ISSN 1544-3566, E-ISSN 1544-3973, Vol. 20, no 4, article id 57. Article in journal (Refereed), Published
Abstract [en]
Over the last three decades, innovations in the memory subsystem were primarily targeted at overcoming the data movement bottleneck. In this paper, we focus on a specific market trend in memory technology: 3D-stacked memory and caches. We investigate the impact of extending the on-chip memory capabilities in future HPC-focused processors, particularly by 3D-stacked SRAM. First, we propose a method oblivious to the memory subsystem to gauge the upper-bound in performance improvements when data movement costs are eliminated. Then, using the gem5 simulator, we model two variants of a hypothetical LARge Cache processor (LARC), fabricated in 1.5 nm and enriched with high-capacity 3D-stacked cache. With a volume of experiments involving a broad set of proxy-applications and benchmarks, we aim to reveal how HPC CPU performance will evolve, and conclude an average boost of 9.56× for cache-sensitive HPC applications, on a per-chip basis. Additionally, we exhaustively document our methodological exploration to motivate HPC centers to drive their own technological agenda through enhanced co-design.
Place, publisher, year, edition, pages
Association for Computing Machinery, 2023
Keywords
3D-stacked memory, Emerging architecture study, gem5 simulation, proxy-applications
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-342395 (URN); 10.1145/3629520 (DOI); 001153375300012 (); 2-s2.0-85181487217 (Scopus ID)
Note
QC 20240118
2024-01-17; 2024-01-17; 2024-02-26. Bibliographically approved