Exploiting DMA for Performance and Energy Optimized STREAM on a DSP
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
University of Houston.
KTH, School of Computer Science and Communication (CSC), Centres, Centre for High Performance Computing, PDC.
Texas Instruments.
2014 (English) In: IPDPSW ’14: Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014, 805-814 p. Conference paper, Published paper (Refereed)
Abstract [en]

Energy efficiency is of major concern in HPC. DSP architectures have the potential to offer highly competitive energy efficiency for applications requiring 64-bit floating-point precision. For STREAM, we achieved 1.47 GB/J energy efficiency and 96% DDR3 memory bandwidth utilization on the Texas Instruments TMS320C6678 DSP by using its DMA engines for prefetching to avoid cache misses, which cause pipeline stalls in the DSP’s cores, and to prevent write-allocate loads, which would significantly reduce performance. The DMA engines were also used to coordinate the DSP’s cores and schedule main memory accesses to improve DDR3 bandwidth utilization. We briefly describe the instrumentation that we designed and implemented for accurate measurement of the core-related, on-chip memory, and DDR3 power consumption and the effectiveness of the DSP’s power saving mechanisms to trade off performance and energy efficiency.

Place, publisher, year, edition, pages
2014. 805-814 p.
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-163685
DOI: 10.1109/IPDPSW.2014.92
Scopus ID: 2-s2.0-84918791502
ISBN: 978-1-4799-4116-2 (print)
OAI: oai:DiVA.org:kth-163685
DiVA: diva2:801801
Conference
2014 IEEE International Parallel & Distributed Processing Symposium Workshops
Funder
EU, FP7, Seventh Framework Programme, RI-261557
EU, FP7, Seventh Framework Programme, RI-283493
Swedish National Infrastructure for Computing (SNIC)
Note

QC 20150413

Available from: 2015-04-10 Created: 2015-04-10 Last updated: 2015-04-13 Bibliographically approved

Search in DiVA

By author/editor
Netzer, Gilbert; Ahlin, Daniel; Laure, Erwin