CNN Sensor Analytics With Hybrid-Float6 Quantization on Low-Power Embedded FPGAs
2023 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 11, p. 4852-4868
Article in journal (Refereed) Published
Abstract [en]
The use of artificial intelligence (AI) in sensor analytics is entering a new era based on ubiquitous embedded connected devices. This transformation requires design techniques that reconcile accurate results with sustainable system architectures. As such, both the efficiency of AI hardware engines and backward compatibility must be considered. In this paper, we present the Hybrid-Float6 (HF6) quantization and its dedicated hardware design. We propose an optimized multiply-accumulate (MAC) hardware design that reduces the mantissa multiplication to a multiplexer-adder operation. We exploit the intrinsic error tolerance of neural networks to further reduce the hardware design through approximation. To preserve model accuracy, we present a quantization-aware training (QAT) method, which in some cases improves accuracy. We demonstrate this concept in 2D convolution layers. We present a lightweight tensor processor (TP) implementing a pipelined vector dot-product. For compatibility and portability, the 6-bit floating-point (FP) value is wrapped in the standard FP format and is automatically extracted by the proposed hardware. The hardware/software architecture is compatible with TensorFlow (TF) Lite. We evaluate the applicability of our approach with a CNN-regression model for anomaly localization in a structural health monitoring (SHM) application based on acoustic emission (AE). The embedded hardware/software framework is demonstrated on the XC7Z007S, the smallest Zynq-7000 SoC. The proposed implementation achieves a peak power efficiency and run-time acceleration of 5.7 GFLOPS/s/W and 48.3x, respectively.
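The two ideas the abstract combines, a 6-bit value carried in a standard float and a mantissa product reduced to a multiplexer and an adder, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the bit layout, so the split used here (1 sign, 4 exponent, 1 mantissa bit), the exponent range, the rounding mode, flush-to-zero on underflow and saturation on overflow are all assumptions, and every function name is hypothetical.

```python
import math

# Assumed layout (not given in the abstract): 1 sign + 4 exponent + 1 mantissa bit.
EXP_BITS, MAN_BITS = 4, 1
BIAS = 2 ** (EXP_BITS - 1) - 1      # 7
E_MIN, E_MAX = -BIAS, BIAS          # assumed exponent range; no subnormals, inf or NaN


def encode(x):
    """Map x to (sign, exponent, mantissa_bit) on the assumed grid; None means zero."""
    if x == 0.0:
        return None
    m, e = math.frexp(abs(x))       # abs(x) = m * 2**e with 0.5 <= m < 1
    sig, exp = 2.0 * m, e - 1       # normalized form: sig in [1, 2)
    q = round(sig * 2 ** MAN_BITS)  # keep MAN_BITS fractional bits (ties to even)
    if q == 2 ** (MAN_BITS + 1):    # rounding carried into the next power of two
        q, exp = 2 ** MAN_BITS, exp + 1
    if exp < E_MIN:
        return None                 # assumption: underflow is flushed to zero
    if exp > E_MAX:                 # assumption: overflow saturates at the maximum
        q, exp = 2 ** (MAN_BITS + 1) - 1, E_MAX
    return (x < 0, exp, q - 2 ** MAN_BITS)


def decode(code):
    """Return the value of a code as an ordinary float (the standard-FP wrapper)."""
    if code is None:
        return 0.0
    sign, exp, frac = code
    value = math.ldexp(1.0 + frac / 2 ** MAN_BITS, exp)
    return -value if sign else value


def quantize(x):
    """Round x to the nearest value on the assumed 6-bit grid."""
    return decode(encode(x))


def mux_add_multiply(a, code):
    """Compute a * decode(code) without a mantissa multiplier (1-bit mantissa only)."""
    if code is None:
        return 0.0
    sign, exp, frac = code
    shifted = math.ldexp(a, exp)                  # exponent addition: a * 2**exp
    half = math.ldexp(a, exp - 1)                 # a * 2**(exp - 1)
    product = shifted + (half if frac else 0.0)   # multiplexer selects 0 or half
    return -product if sign else product


weight = encode(-0.7)                  # nearest grid value is -0.75
print(decode(weight))                  # -0.75
print(mux_add_multiply(3.0, weight))   # -2.25
```

With a single mantissa bit the weight's significand is either 1.0 or 1.5, so the product needs only an exponent addition plus an optional add of the half-scaled operand, which is the role the multiplexer plays in this sketch.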
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023. Vol. 11, p. 4852-4868
Keywords [en]
Hardware, Quantization (signal), Field programmable gate arrays, Tensors, Computational modeling, Convolutional neural networks, Computer architecture, structural health monitoring, hardware accelerator, TensorFlow Lite, embedded systems, FPGA, custom floating-point
National Category
Computer Systems; Embedded Systems
Identifiers
URN: urn:nbn:se:kth:diva-324310
DOI: 10.1109/ACCESS.2023.3235866
ISI: 000917229200001
Scopus ID: 2-s2.0-85147312363
OAI: oai:DiVA.org:kth-324310
DiVA, id: diva2:1739635
Note
QC 20230227
Available from: 2023-02-27. Created: 2023-02-27. Last updated: 2023-02-27. Bibliographically approved