Parallel distributed scalable runtime address generation scheme for a coarse grain reconfigurable computation and storage fabric
KTH, School of Information and Communication Technology (ICT), Electronic Systems. (ESY ELECTRONICS AND EMBEDDED SYSTEMS)
KTH, School of Information and Communication Technology (ICT), Electronic Systems. (ESY ELECTRONICS AND EMBEDDED SYSTEMS) ORCID iD: 0000-0003-0565-9376
KTH, School of Information and Communication Technology (ICT), Electronic Systems. (ESY ELECTRONICS AND EMBEDDED SYSTEMS)
KTH, School of Information and Communication Technology (ICT), Electronic Systems.
2014 (English). In: Microprocessors and microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 38, no 8, p. 788-802. Article in journal (Refereed). Published
Abstract [en]

This paper presents a hardware-based, scalable runtime address generation scheme for DSP applications mapped to a parallel distributed coarse grain reconfigurable computation and storage fabric. The scheme can also handle non-affine functions of multiple variables, which typically correspond to multiple nested loops. The key innovation is the judicious use of two categories of address generation resources. The first category is the low-cost AGU, which generates addresses within given address bounds for affine functions of up to two variables. Such low-cost AGUs are distributed and associated with every read/write port in the distributed memory architecture. The second category is more complex; it is also distributed, but shared among a few storage units, and it handles more demanding address generation requirements, such as dynamic computation of the address bounds that are then used to configure the AGUs and transformation of non-affine functions into affine functions by computing the affine factor outside the loop. Runtime computation of the address constraints incurs negligible overhead in latency, area and energy, while substantially reducing program storage and energy and improving reconfiguration agility compared to the prevalent pre-computation of address constraints. The efficacy of the proposed method has been validated against the prevalent address generation schemes for a set of six realistic DSP functions. Compared to the pre-computation method, the proposed solution achieved 75% average code compaction; compared to the centralized runtime address generation scheme, it achieved 32.7% average performance improvement.
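The two resource categories described in the abstract can be illustrated with a small software sketch. This is not the paper's hardware design: the function names, parameters, and the example address function `k*i*width + j` are all assumptions chosen for illustration. The first function models a low-cost AGU for an affine function of two variables; the second models the shared resource, which computes the affine factor once per outer iteration and then configures the AGU with it.

```python
from typing import Iterator, List


def affine_agu(base: int, stride_i: int, count_i: int,
               stride_j: int, count_j: int) -> Iterator[int]:
    """Hypothetical low-cost AGU: yields base + i*stride_i + j*stride_j
    for a two-level loop nest, i.e. an affine function of two variables."""
    for i in range(count_i):
        for j in range(count_j):
            yield base + i * stride_i + j * stride_j


def nonaffine_addresses(n_k: int, n_i: int, n_j: int, width: int) -> List[int]:
    """Hypothetical shared resource: the address k*i*width + j is not affine
    in (k, i) jointly, but for a fixed k it is affine in (i, j).
    The factor k*width is computed outside the inner loops and used to
    reconfigure the AGU once per outer iteration."""
    addresses: List[int] = []
    for k in range(n_k):
        stride_i = k * width  # affine factor, computed once per value of k
        addresses.extend(
            affine_agu(base=0, stride_i=stride_i, count_i=n_i,
                       stride_j=1, count_j=n_j)
        )
    return addresses


# A 2 x 3 affine access pattern starting at address 10 with row stride 4.
print(list(affine_agu(base=10, stride_i=4, count_i=2, stride_j=1, count_j=3)))
```

In this sketch only the loop counts and strides need to be stored per access pattern, rather than a precomputed list of addresses, which is the intuition behind the code compaction the abstract reports.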

Place, publisher, year, edition, pages
2014. Vol. 38, no 8, 788-802 p.
Keyword [en]
Streaming address generation, CGRA, Parallel distributed DSP, Code compaction
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-159997
DOI: 10.1016/j.micpro.2014.05.009
ISI: 000347755200006
Scopus ID: 2-s2.0-84910626332
OAI: oai:DiVA.org:kth-159997
DiVA: diva2:790664
Note

QC 20150225

Available from: 2015-02-25. Created: 2015-02-12. Last updated: 2017-12-04. Bibliographically approved
In thesis
1. SiLago: Enabling System Level Automation Methodology to Design Custom High-Performance Computing Platforms: Toward Next Generation Hardware Synthesis Methodologies
2016 (English). Doctoral thesis, comprehensive summary (Other academic)
Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2016. 56 p.
Series
TRITA-ICT, 2016:05
Keyword
System Level Synthesis, High Level Synthesis, VLSI Design Methodology, Brain-like Computation, Neuromorphic Hardware, Address Generation, Thread Level Parallelism
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
URN: urn:nbn:se:kth:diva-185787
ISBN: 978-91-7595-900-9
Public defence
2016-05-17, Sal B, Electrum 229, Isafjordsgatan 22, Kista, Stockholm, 20:24 (English)
Note

QC 20160428

Available from: 2016-04-28. Created: 2016-04-27. Last updated: 2016-04-28. Bibliographically approved

Authors
Farahini, Nasim; Hemani, Ahmed; Sohofi, Hassan; Jafri, Syed M. A. H.; Tajammul, Muhammad Adeel