Nanjing University, School of Integrated Circuits, and School of Electronic Science and Engineering, Nanjing, Jiangsu, China, 210023, Jiangsu; Nanjing University, Interdisciplinary Research Center for Future Intelligent Chips (Chip-X), Suzhou, China.
Jiangsu Huachuang Microsystem Company Ltd., Nanjing, Jiangsu, China.
KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems.
2024 (English). In: IEEE Transactions on Computers, ISSN 0018-9340, E-ISSN 1557-9956, Vol. 73, no. 12, p. 2882-2896. Article in journal (Refereed), Published
Abstract [en]
Dataflow choices, that is, intra-core neural network (NN) loop-nest scheduling and inter-core hardware mapping strategies, play a critical role in the performance and energy efficiency of network-on-chip (NoC) based neural network accelerators. Confronted with an enormous dataflow exploration space, this paper proposes an automatic framework for generating and optimizing full-layer mappings based on two reinforcement learning algorithms, A2C and PPO. Combining soft and hard constraints, this work transforms the mapping configuration into a sequential decision problem and aims to find performance- and energy-efficient hardware mappings for NoC systems. We evaluate the proposed framework on 10 experimental neural networks. The results show that, compared with direct-X mapping, direct-Y mapping, GA-based mapping, and NN-aware mapping, our optimization framework reduces the average execution time of the 10 experimental NNs by 9.09%, improves throughput by 11.27%, reduces energy by 12.62%, and reduces the time-energy product (TEP) by 14.49%. The results also show that the performance enhancement is related to the coefficient of variation of the neural network to be computed.
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
hardware mapping, Network-on-chip, neural networks, reinforcement learning
National Category
Embedded Systems
Identifiers
urn:nbn:se:kth:diva-367187 (URN)
10.1109/TC.2024.3441822 (DOI)
001351576000018 (ISI)
2-s2.0-85201268286 (Scopus ID)
Note
QC 20250716
Available from: 2025-07-16. Created: 2025-07-16. Last updated: 2025-07-16. Bibliographically approved