Unleashing 8-Bit Floating Point Formats Out of the Deep-Learning Domain
2024 (English). In: 2024 31st IEEE International Conference on Electronics, Circuits and Systems, ICECS 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]
Reduced-precision floating-point (FP) arithmetic is a technology trend to minimize memory usage and execution time on power-constrained devices. This paper explores the potential applications of the 8-bit FP format beyond the classical deep learning use cases. We comprehensively analyze alternative FP8 formats, considering the allocation of mantissa and exponent bits. Additionally, we examine the impact on energy efficiency, accuracy, and execution time of several digital signal processing and classical machine learning kernels using the parallel ultra-low-power (PULP) platform based on the RISC-V instruction set architecture. Our findings show that an appropriate exponent choice combined with suitable scaling methods results in acceptable errors compared to FP32. Our study facilitates the adoption of FP8 formats outside the deep learning domain to achieve consistent energy efficiency and speed improvements without compromising accuracy. On average, our results indicate speedups of 3.14x, 6.19x, 11.11x, and 18.81x on 1, 2, 4, and 8 cores, respectively. Furthermore, the vectorized implementation of FP8 in the same setup delivers energy savings of 2.97x, 5.07x, 7.37x, and 15.05x, respectively.
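As an illustration of the trade-off the abstract describes, the following Python sketch rounds a value to an 8-bit float with a configurable split between exponent and mantissa bits, then shows how a scale factor can reduce error. It is not taken from the paper: the function names `quantize_fp8` and `max_rel_error`, the IEEE-like encoding (top exponent code reserved, subnormals supported, saturation at the largest finite value), and the single power-of-two scale factor are all assumptions made for the example, and the paper's formats and scaling methods may differ.

```python
import math

def quantize_fp8(x, exp_bits, man_bits):
    """Round x to the nearest value of a hypothetical IEEE-like 8-bit float.

    Assumed layout: 1 sign bit, exp_bits exponent bits, man_bits mantissa bits,
    top exponent code reserved, subnormals supported, overflow saturates.
    """
    assert 1 + exp_bits + man_bits == 8
    if x == 0.0 or math.isnan(x):
        return x
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = (2 ** exp_bits - 2) - bias      # largest normal exponent
    min_exp = 1 - bias                        # smallest normal exponent
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** max_exp
    sign = -1.0 if x < 0 else 1.0
    a = min(abs(x), max_val)                  # saturate (also handles infinity)
    # frexp returns a = m * 2**e with 0.5 <= m < 1, so the exponent is e - 1.
    exponent = max(math.frexp(a)[1] - 1, min_exp)
    step = 2.0 ** (exponent - man_bits)       # spacing of representable values
    q = round(a / step) * step                # round half to even
    return sign * min(q, max_val)

def max_rel_error(values, exp_bits, man_bits, scale=1.0):
    """Worst-case relative error after scale -> quantize -> unscale."""
    return max(
        abs(quantize_fp8(v * scale, exp_bits, man_bits) / scale - v) / abs(v)
        for v in values
    )

if __name__ == "__main__":
    # More mantissa bits give finer resolution near 1.0 ...
    print(quantize_fp8(1.1, 5, 2))   # 1.0   (5 exponent bits, 2 mantissa bits)
    print(quantize_fp8(1.1, 4, 3))   # 1.125 (4 exponent bits, 3 mantissa bits)
    # ... while more exponent bits give a wider range before saturation.
    print(quantize_fp8(1e6, 5, 2), quantize_fp8(1e6, 4, 3))
    # Small inputs fall into the subnormal range unless they are scaled first.
    data = [0.0012, 0.0031, 0.0075, 0.0102]
    print(max_rel_error(data, 4, 3), max_rel_error(data, 4, 3, scale=2.0 ** 10))
```

In this sketch the scale factor is a power of two, so scaling and unscaling are exact in binary floating point and all remaining error comes from the 8-bit rounding itself.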
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024.
Keywords [en]
approximate computing, float8, Parallel ultra-low-power platform, RISC-V, smallFloat data types
National Category
Computer Systems; Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-360165
DOI: 10.1109/ICECS61496.2024.10848785
ISI: 001445799800055
Scopus ID: 2-s2.0-85217619865
OAI: oai:DiVA.org:kth-360165
DiVA, id: diva2:1938782
Conference
31st IEEE International Conference on Electronics, Circuits and Systems, ICECS 2024, Nancy, France, Nov 18 2024 - Nov 20 2024
Note
Part of ISBN 979-8-3503-7720-0
QC 20250224
2025-02-19, 2025-02-19, 2025-05-05. Bibliographically approved