MTP-caffe: Memory, timing, and power aware tool for mapping CNNs to GPUs
KTH.
KTH, School of Information and Communication Technology (ICT), Electronics.
KTH, School of Information and Communication Technology (ICT), Electronics.
KTH, School of Information and Communication Technology (ICT), Electronics.
2017 (English). In: ACM International Conference Proceeding Series, Association for Computing Machinery (ACM), 2017, p. 31-36. Conference paper (Refereed)
Abstract [en]

Convolutional Neural Networks (CNNs) have recently attracted intense research. Their high processing requirements, together with the availability of efficient mapping tools, have made GPUs a popular CNN accelerator. To extract maximum performance, the mapping tools transform convolutions, which the GPU does not support directly, into GPU-supported matrix multiplications. However, this transformation incurs significant memory overhead (3-5X). Furthermore, since the tool is unaware of the GPU architecture, performance and power remain sub-optimal even after the transformation. To tackle this problem, we present MTP-Caffe, which complements Caffe by making it memory, timing, and power aware. It analyses the CNN structure and the GPU architecture to convert a CNN into smaller parts tailored to the GPU resources. Simulation results reveal that MTP-Caffe not only eliminates the additional memory overhead but also provides up to 21% speedup and up to 23.5% lower power consumption.
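The convolution-to-matrix-multiplication transformation mentioned in the abstract is commonly implemented as an "im2col" lowering, in which every input patch is copied into its own row so that the convolution becomes a dot product per row. The sketch below is an illustration of that general technique, not the authors' code: the function names are hypothetical, it handles a single channel without padding in `im2col`, and the expansion factor it computes depends on the layer shape, so it is not expected to reproduce the 3-5X figure reported above.

```python
def conv_output_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * pad - kernel) // stride + 1


def im2col_expansion(channels, height, width, kernel, stride=1, pad=0):
    """Ratio of im2col buffer elements to original input elements."""
    out_h = conv_output_size(height, kernel, stride, pad)
    out_w = conv_output_size(width, kernel, stride, pad)
    input_elems = channels * height * width
    col_elems = (channels * kernel * kernel) * (out_h * out_w)
    return col_elems / input_elems


def im2col(image, kernel, stride=1):
    """Copy each kernel-sized patch of a 2-D list into its own row."""
    out_h = conv_output_size(len(image), kernel, stride)
    out_w = conv_output_size(len(image[0]), kernel, stride)
    rows = []
    for i in range(out_h):
        for j in range(out_w):
            rows.append([image[i * stride + di][j * stride + dj]
                         for di in range(kernel)
                         for dj in range(kernel)])
    return rows


def conv_via_matmul(image, weights, stride=1):
    """Convolution (cross-correlation) as patch-matrix times flat weights."""
    flat_w = [w for row in weights for w in row]
    return [sum(p * w for p, w in zip(patch, flat_w))
            for patch in im2col(image, len(weights), stride)]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(conv_via_matmul(image, [[1, 0], [0, 1]]))
    print(im2col_expansion(3, 32, 32, kernel=3, stride=1, pad=1))
```

Because overlapping patches are duplicated, the lowered buffer is larger than the input; for a stride-1, 3x3 kernel with padding 1 each input element is copied roughly nine times.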

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2017. 31-36 p.
National Category
Embedded Systems
Identifiers
URN: urn:nbn:se:kth:diva-208445
DOI: 10.1145/3029580.3029585
Scopus ID: 2-s2.0-85016469511
ISBN: 9781450348775
OAI: oai:DiVA.org:kth-208445
DiVA: diva2:1107185
Conference
8th Workshop on Parallel Programming and Run-Time Management Techniques for Many-Core Architectures and 6th Workshop on Design Tools and Architectures For Multicore Embedded Computing Platforms, PARMA-DITAM 2017, Stockholm, Sweden, 25 January 2017
Note

QC 20170609

Available from: 2017-06-09. Created: 2017-06-09. Last updated: 2017-06-09. Bibliographically approved.

Open Access in DiVA

No full text

Search in DiVA

By author/editor
Jafri, Syed; Hemani, Ahmed; Stathis, Dimitrios
By organisation
KTH; Electronics
Embedded Systems
