Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Area and Performance Optimization of Barrier Synchronization on Multi-core Network-on-Chips
KTH, School of Information and Communication Technology (ICT), Electronic Systems.
KTH, School of Information and Communication Technology (ICT), Electronic Systems.ORCID iD: 0000-0003-0061-3475
KTH, School of Information and Communication Technology (ICT), Electronic Systems.
2010 (English)In: 3rd IEEE International Conference on Computer and Electrical Engineering (ICCEE), 2010Conference paper, Published paper (Refereed)
Abstract [en]

Barrier synchronization is commonly and widelyused to synchronize the execution of parallel processor coreson multi-core Network-on-Chips (NoCs). Since its globalnature may cause heavy serialization resulting in largeperformance penalty, barrier synchronization should becarefully designed to have low latency communication and tominimize overall completion time. Therefore, in the paper, wepropose a fast barrier synchronization mechanism, targetingMulti-core NoCs. The fast barrier synchronization mechanismincludes a dedicated hardware module, named Fast BarrierSynchronizer (FBS), integrated with each processor node. Itoffers a set of barrier counters and can concurrently processsynchronization requests issued by the local node and remotenodes via the on-chip network. The salient feature of our fastbarrier synchronization mechanism is that, once the barriercondition is reached, the “barrier release” acknowledgement isrouted to all processor nodes in a broadcast way in order tosave chip area by avoiding storing source node informationand to minimize completion time by avoiding serialization ofbarrier releasing. Synthesis results suggest that the FBS canrun over 1 GHz in SMIC® 130nm technology with small areaoverhead. We implemented a FBS-enhanced multi-core NoCarchitecture on our FPGA platform using the Xilinx® Virtex 5as the FPGA chip. FPGA utilization and simulation resultsshow that our fast barrier synchronization demonstrates botharea and performance advantages over the barriersynchronization counterpart with unicast barrier releasing.

Place, publisher, year, edition, pages
2010.
Keyword [en]
Barrier Synchronization; Multi-core; Network-on- Chips (NoCs)
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-63630OAI: oai:DiVA.org:kth-63630DiVA: diva2:482838
Conference
2010 3rd IEEE International Conference on Computer and Electrical Engineering (ICCEE 2010). Chengdu, China. 16 - 18 November 2010
Note
Key: Nostrum. QC 20120425Available from: 2012-01-24 Created: 2012-01-24 Last updated: 2012-04-25Bibliographically approved

Open Access in DiVA

No full text

Other links

http://web.it.kth.se/~axel/papers/2010/ICCEE-XiaowenChen.pdf

Authority records BETA

Lu, Zhonghai

Search in DiVA

By author/editor
Chen, XiaowenLu, ZhonghaiJantsch, Axel
By organisation
Electronic Systems
Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric score

urn-nbn
Total: 66 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf