Exploiting DMA for Performance and Energy Optimized STREAM on a DSP
KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC.
University of Houston.
KTH, Skolan för datavetenskap och kommunikation (CSC), Centra, Parallelldatorcentrum, PDC.
Texas Instruments.
2014 (English). In: IPDPSW ’14: Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014, pp. 805-814. Conference paper, published paper (peer-reviewed)
Abstract [en]

Energy efficiency is of major concern in HPC. DSP architectures have the potential to offer highly competitive energy efficiency for applications requiring 64-bit floating-point precision. For STREAM, we achieved 1.47 GB/J energy efficiency and 96% DDR3 memory bandwidth utilization on the Texas Instruments TMS320C6678 DSP by using its DMA engines for prefetching to avoid cache misses, which cause pipeline stalls in the DSP’s cores, and to prevent write-allocate loads, which would significantly reduce performance. The DMA engines were also used to coordinate the DSP’s cores and schedule main memory accesses to improve DDR3 bandwidth utilization. We briefly describe the instrumentation that we designed and implemented for accurate measurement of the core-related, on-chip memory, and DDR3 power consumption and the effectiveness of the DSP’s power saving mechanisms to trade off performance and energy efficiency.
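The prefetching idea in the abstract can be illustrated with a minimal double-buffering sketch of the STREAM triad kernel (a[i] = b[i] + scalar * c[i]). This is not the authors' implementation: the function name `triad_double_buffered`, the helpers `dma_in` and `dma_out`, and the `tile` parameter are hypothetical, and plain Python list copies stand in for DMA transfers between DDR3 and on-chip memory. The code runs sequentially, so it only shows the order of operations; on real hardware the fetch of the next tile would proceed concurrently with the computation on the current one.

```python
def triad_double_buffered(b, c, scalar, tile=4):
    """STREAM triad a[i] = b[i] + scalar * c[i], processed tile by tile.

    Illustrative only: list slicing stands in for DMA transfers.
    """
    n = len(b)
    a = [0.0] * n
    if n == 0:
        return a

    def dma_in(start):
        # Hypothetical stand-in for a DMA read from main memory
        # into an on-chip buffer.
        end = min(start + tile, n)
        return b[start:end], c[start:end]

    def dma_out(start, values):
        # Hypothetical stand-in for a DMA write-back. The destination
        # is written directly, without first loading its old contents.
        a[start:start + len(values)] = values

    nxt = dma_in(0)  # prefetch the first tile
    for start in range(0, n, tile):
        cur = nxt
        if start + tile < n:
            # Request the next tile before computing on the current one.
            nxt = dma_in(start + tile)
        out = [x + scalar * y for x, y in zip(*cur)]
        dma_out(start, out)
    return a
```

Calling `triad_double_buffered([1.0, 2.0], [10.0, 20.0], 2.0, tile=1)` walks through two tiles, fetching the second before computing the first.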

Place, publisher, year, edition, pages
2014. pp. 805-814
Identifiers
URN: urn:nbn:se:kth:diva-163685
DOI: 10.1109/IPDPSW.2014.92
Scopus ID: 2-s2.0-84918791502
ISBN: 978-1-4799-4116-2 (print)
OAI: oai:DiVA.org:kth-163685
DiVA, id: diva2:801801
Conference
2014 IEEE International Parallel Distributed Processing Symposium Workshops
Funding agency
EU, FP7, Seventh Framework Programme, RI-261557
EU, FP7, Seventh Framework Programme, RI-283493
Swedish National Infrastructure for Computing (SNIC)
Note

QC 20150413

Available from: 2015-04-10. Created: 2015-04-10. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

By author/editor
Netzer, Gilbert; Ahlin, Daniel; Laure, Erwin