Smilodon: An Efficient Accelerator for Low Bit-Width CNNs with Task Partitioning
Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Jiangsu, Peoples R China.
2019 (English). In: 2019 IEEE International Symposium on Circuits and Systems (ISCAS), IEEE, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

Convolutional Neural Networks (CNNs) have been widely applied in various fields such as image and video recognition, recommender systems, and natural language processing. However, their massive size and intensive computation loads hinder practical deployment, especially on embedded systems. As a highly competitive candidate, low bit-width CNNs have been proposed to enable efficient implementation. In this paper, we propose Smilodon, a scalable, efficient accelerator for low bit-width CNNs based on a parallel streaming architecture, optimized with a task partitioning strategy. We also present 3D systolic-like computing arrays suited to convolutional layers. Our design is implemented on a Zynq XC7Z020 FPGA and meets real-time requirements with a throughput of 1,622 FPS while consuming 2.1 W. To the best of our knowledge, our accelerator is superior to state-of-the-art works in the tradeoff among throughput, power efficiency, and area efficiency.

Place, publisher, year, edition, pages
IEEE, 2019.
Series
IEEE International Symposium on Circuits and Systems, ISSN 0271-4302
Keywords [en]
Low bit-width CNNs, 3D systolic-like array, task partitioning, parallel streaming architecture
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-260226
DOI: 10.1109/ISCAS.2019.8702547
ISI: 000483076402015
Scopus ID: 2-s2.0-85066787463
ISBN: 978-1-7281-0397-6 (print)
OAI: oai:DiVA.org:kth-260226
DiVA, id: diva2:1355394
Conference
IEEE International Symposium on Circuits and Systems (IEEE ISCAS), May 26-29, 2019, Sapporo, Japan
Note

QC 20190927

Available from: 2019-09-27. Created: 2019-09-27. Last updated: 2019-09-27. Bibliographically approved.

Authority records

Lu, Zhonghai
