MOCHA: Morphable Locality and Compression Aware Architecture for Convolutional Neural Networks
Jafri, Syed — KTH, School of Information and Communication Technology (ICT), Electronics.
Hemani, Ahmed — KTH, School of Information and Communication Technology (ICT), Electronics.
Paul, Kolin — KTH, School of Information and Communication Technology (ICT), Electronics.
2017 (English). In: Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium, IPDPS 2017, Institute of Electrical and Electronics Engineers (IEEE), 2017, p. 276-286, article id 7967117. Conference paper, published paper (refereed).
Abstract [en]

Today, machine learning based on neural networks has become mainstream in many application domains. A small subset of machine learning algorithms, called Convolutional Neural Networks (CNNs), are considered state-of-the-art for many applications (e.g. video/audio classification). The main challenge in implementing CNNs in embedded systems is their large computation, memory, and bandwidth requirements. To meet these demands, dedicated hardware accelerators have been proposed. Since memory is the major cost in CNNs, recent accelerators focus on reducing memory accesses. In particular, they exploit data locality using tiling, layer merging, or intra/inter feature-map parallelism to reduce the memory footprint. However, they lack the flexibility to interleave or cascade these optimizations. Moreover, most existing accelerators do not exploit compression, which can simultaneously reduce memory requirements, increase throughput, and enhance energy efficiency. To tackle these limitations, we present a flexible accelerator called MOCHA. MOCHA has three features that differentiate it from the state-of-the-art: (i) the ability to compress input/kernels, (ii) the flexibility to interleave various optimizations, and (iii) the intelligence to automatically interleave and cascade the optimizations, depending on the dimensions of a specific CNN layer and the available resources. Post-layout synthesis results reveal that MOCHA provides up to 63% higher energy efficiency, up to 42% higher throughput, and up to 30% less storage than the next best accelerator, at the cost of 26-35% additional area.
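The locality optimizations named in the abstract (tiling, layer merging, feature-map parallelism) are generic CNN-accelerator techniques. As a minimal sketch of one of them, the C fragment below applies loop tiling to a single convolution layer. It is not MOCHA's implementation; all dimensions and identifiers (IC, OC, H, W, K, TILE, conv_tiled) are illustrative assumptions. Tiling the output map bounds the input patch each block touches, which is what lets an accelerator keep the working set in small on-chip buffers instead of re-reading DRAM for every kernel tap.

    /* Illustrative sketch only: generic loop tiling for one CNN
     * convolution layer, NOT the MOCHA architecture itself.
     * All sizes below are hypothetical. */
    #include <stddef.h>

    #define IC   16   /* input channels  (assumed)                    */
    #define OC   32   /* output channels (assumed)                    */
    #define H    64   /* input feature-map height                     */
    #define W    64   /* input feature-map width                      */
    #define K    3    /* kernel size                                  */
    #define OH   (H - K + 1)
    #define OW   (W - K + 1)
    #define TILE 8    /* output tile edge: bounds the working set     */

    static inline int min_int(int a, int b) { return a < b ? a : b; }

    void conv_tiled(const float in[IC][H][W],
                    const float ker[OC][IC][K][K],
                    float out[OC][OH][OW])
    {
        /* Tiling the two output dimensions means each inner block
         * reads only a (TILE+K-1)^2 patch of every input map, so the
         * patch fits in a small on-chip buffer rather than being
         * re-fetched from external memory. */
        for (int oc = 0; oc < OC; oc++)
        for (int ty = 0; ty < OH; ty += TILE)
        for (int tx = 0; tx < OW; tx += TILE)
            for (int y = ty; y < min_int(ty + TILE, OH); y++)
            for (int x = tx; x < min_int(tx + TILE, OW); x++) {
                float acc = 0.0f;
                for (int ic = 0; ic < IC; ic++)
                for (int ky = 0; ky < K; ky++)
                for (int kx = 0; kx < K; kx++)
                    acc += in[ic][y + ky][x + kx] * ker[oc][ic][ky][kx];
                out[oc][y][x] = acc;
            }
    }

Interleaving or cascading such optimizations per layer, as the abstract describes, then amounts to choosing the tile size and loop order for each layer based on its dimensions and the available on-chip storage.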

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2017. p. 276-286, article id 7967117
Keywords [en]
Accelerators, Computer architecture, Convolutional Neural Networks, Reconfigurable computing
National Category
Other Engineering and Technologies
Identifiers
URN: urn:nbn:se:kth:diva-213209
DOI: 10.1109/IPDPS.2017.59
ISI: 000427044800029
Scopus ID: 2-s2.0-85027691759
ISBN: 9781538639146
OAI: oai:DiVA.org:kth-213209
DiVA id: diva2:1137509
Conference
31st IEEE International Parallel and Distributed Processing Symposium, IPDPS 2017, Orlando, United States, 29 May 2017 through 2 June 2017
Note

QC 20170831

Available from: 2017-08-31. Created: 2017-08-31. Last updated: 2018-04-03. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Search in DiVA

By author/editor
Jafri, Syed; Hemani, Ahmed; Paul, Kolin
By organisation
Electronics
Other Engineering and Technologies

Search outside of DiVA

Google
Google Scholar
