Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Performance Analysis and Design Space Exploration of On-Chip Interconnection Networks
KTH, School of Information and Communication Technology (ICT), Electronic Systems.
2013 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The advance of semiconductor technology, which has led to more than one billion transistors on a single chip, has enabled designers to integrate dozens of IP (intellectual property) blocks together with large amounts of embedded memory. These advances, along with the fact that traditional communication architectures do not scale well have led to significant changes in the architecture and design of integrated circuits. One solution to these problems is to implement such a complex system using an on-chip interconnection network or network-on-chip (NoC). The multiple concurrent connections of such networks mean that they have extremely high bandwidth. Regularity can lead to design modularity providing a standard interface for easier component reuse and improved interoperability.

The present thesis addresses the performance analysis and design space exploration of NoCs using analytical and simulation-based performance analysis approaches. At first, we developed a simulator aimed to performance analysis of interconnection networks. The simulator is then used to evaluate the performance of networks topologies and routing algorithms since their choice heavily affect the performance of NoCs. Then, we surveyed popular mathematical formalisms – queueing theory, network calculus, schedulability analysis, and dataflow analysis – and how they have been applied to the analysis of on-chip communication performance in NoCs. We also addressed research problems related to modelling and design space exploration of NoCs.

In the next step, analytical router models were developed that analyse NoC performance. In addition to providing aggregate performance metrics such as latency and throughput, our approach also provides feedback about the network characteristics at a fine-level of granularity. Our approach explicates the impact that various design parameters have on the performance, thereby providing invaluable insight into NoC design. This makes it possible to use the proposed models as a powerful design and optimisation tool.

We then used the proposed analytical models to address the design space exploration and optimisation problem. System-level frameworks to address the application mapping and to design routing algorithms for NoCs were presented. We first formulated an optimisation problem of minimizing average packet latency in the network, and then solved this problem using the simulated annealing heuristic. The proposed framework can also address other design space exploration problems such as topology selection and buffer dimensioning.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2013. , xxi, 37 p.
Series
Trita-ICT-ECS AVH, ISSN 1653-6363 ; 13:21
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-136409ISBN: 978-91-7501-923-9 (print)OAI: oai:DiVA.org:kth-136409DiVA: diva2:676041
Public defence
2013-12-18, Sal/Hall D, Forum, KTH-ICT, Isafjordsgatan 39, Kista, 13:00 (English)
Opponent
Supervisors
Note

QC 20131205

Available from: 2013-12-05 Created: 2013-12-05 Last updated: 2013-12-05Bibliographically approved
List of papers
1. A performance and power analysis of WK-recursive and mesh networks for network-on-chips
Open this publication in new window or tab >>A performance and power analysis of WK-recursive and mesh networks for network-on-chips
2006 (English)Conference paper, Published paper (Refereed)
Abstract [en]

Network-on-Chip (NoC) has been proposed as an attractive alternative to traditional dedicated wires to achieve high performance and modularity. Power efficiency is one of the most important concerns in NoC architecture design. The choice of network topology is important in designing a low-power and high-performance NoC. In this paper, we propose the use of the WK-recursive networks to be used as the underlying topology in NoC. We have implemented VHDL hardware model of mesh and WK-recursive topologies and measured the latency results using simulation with these implementation. We also propose a novel approach in high level power modeling based on latency for these topologies and show that the power consumption of WK-recursive topology is less than that of the equivalent mesh on a chip.

Keyword
Computer hardware description languages; Mesh generation; Recursive functions; Routers; Topology, Computer designs; Hardware models; International conferences; Low powers; Mesh; Mesh networks; Network topologies; Network-on-chip; Network-on-chips; NOC architectures; Performance; Power; Power analysis; Power consumption; Power efficiencies; Power modeling; Recursive topologies; Routing; System-on-chips; WK-recursive mesh, Electric network topology
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-136321 (URN)10.1109/ICCD.2006.4380807 (DOI)000252361000024 ()2-s2.0-38149099704 (Scopus ID)
Conference
24th International Conference on Computer Design 2006, ICCD; San Jose, CA; United States
Note

QC 20131204

Available from: 2013-12-04 Created: 2013-12-04 Last updated: 2013-12-05Bibliographically approved
2. A Markovian performance model for networks-on-chip
Open this publication in new window or tab >>A Markovian performance model for networks-on-chip
2007 (English)In: Proceedings of the 16th Euromicro Conference on Parallel, Distributed and Network-Based Processing, 2007, 157-164 p.Conference paper, Published paper (Refereed)
Abstract [en]

Network-on-Chip (NoC) has been proposed as a solution for addressing the design challenges of future high-performance nanoscale architectures. Thus, it is of crucial importance for a designer to have access to fast methods for evaluating the performance of on-chip network. To this end, we present a Markovian model for evaluating the latency and energy consumption of on-chip networks, We compute the average delay due to path contention, virtual channel and crossbar switch arbitration using a queuing-based approach, which can capture the blocking phenomena of wormhole switching quite accurately. The model is then used to estimate the power consumption of all routers m NoCs The performance results from the analytical models are validated with those obtained from a synthesizable VHDL-based cycle accurate simulator Comparison with simulation results indicate that the proposed analytical model quite accurate and can be used as an efficient design tool by SoC designers.

Series
Euromicro Workshop on Parallel and Distributed Processing, ISSN 1066-6192
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-136400 (URN)10.1109/PDP.2008.83 (DOI)000254266500022 ()2-s2.0-47349131822 (Scopus ID)978-0-7695-3089-5 (ISBN)
Conference
16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, FEB 13-15, 2008, Toulouse, FRANCE
Note

QC 20131205

Available from: 2013-12-05 Created: 2013-12-05 Last updated: 2013-12-05Bibliographically approved
3. PERMAP: A Performance-Aware Mapping for Application-Specific SoCs
Open this publication in new window or tab >>PERMAP: A Performance-Aware Mapping for Application-Specific SoCs
2008 (English)In: 2008 International Conference on Application-Specific Systems, Architectures and Processors, 2008, 73-78 p.Conference paper, Published paper (Refereed)
Abstract [en]

Future System-on-Chip (SoC) designs will need efficient on-chip communication architectures that can provide efficient and scalable data transport among the Intellectual Properties (IPs). Designing and optimizing SoCs is an increasingly difficult task due to the size and complexity of the SoC design space, high cost of detailed simulation, and several constraints that the design must satisfy. For efficient design of SoCs, an efficient mapping of IPs onto Networks-on-Chip (NoCs) is highly desirable. Towards this end, we have presented PERMAP, a PERformance-aware MAPping algorithm which maps the IPs onto a generic NoC architecture such that the average communication delay is minimized This is accomplished by a performance analytical model which can be used for any arbitrary network topology with wormhole routing, The algorithm is used for mapping a video application onto a tile-based NoC and experimental results show that PERMAP is fast and robust.

Series
IEEE INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS, ISSN 1063-6862
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-136401 (URN)10.1109/ASAP.2008.4580157 (DOI)000260566200013 ()2-s2.0-51649113422 (Scopus ID)978-1-4244-1897-8 (ISBN)
Conference
19th IEEE International Conference on Application-Specific Systems, Architectures and Processors, JUL 02-04, 2008, Leuven, BELGIUM
Note

QC 20131205

Available from: 2013-12-05 Created: 2013-12-05 Last updated: 2013-12-05Bibliographically approved
4. Caspian: A tunable performance model for multi-core systems
Open this publication in new window or tab >>Caspian: A tunable performance model for multi-core systems
2008 (English)In: 14th International Euro-Par Conference, Euro-Par 2008, 2008, 100-109 p.Conference paper, Published paper (Refereed)
Abstract [en]

Performance evaluation is an important engineering tool that provides valuable feedback on design choices in the implementation of multi-core systems such as parallel systems, multicomputers, and Systems-on-Chip (SoCs). The significant advantage of analytical models over simulation is that they can be used to obtain performance results for large systems under different configurations and working conditions which may not be feasible to study using simulation on conventional computers due to the excessive computation demands. We present Caspian(1), a novel analytic performance model, aimed to minimize prediction cost, while providing prediction accuracy. This is accomplished by using a G/G/1 priority queueing model which is used for arbitrary network topology with wormhole routing under arbitrary traffic pattern. The accuracy of this model is examined through extensive simulation results.

Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 5168
Keyword
performance evaluation, analytical model, multi-core systems, G/G/1 queueing model
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-136402 (URN)10.1007/978-3-540-85451-7_12 (DOI)000259084900009 ()2-s2.0-51849140540 (Scopus ID)978-3-540-85450-0 (ISBN)
Conference
14th International Euro-Par Conference, AUG 26-29, 2008, Las Palmas de Gran Canaria, SPAIN
Note

QC 20131205

Available from: 2013-12-05 Created: 2013-12-05 Last updated: 2013-12-05Bibliographically approved
5. Power-Efficient Routing Algorithm for Torus NoCs
Open this publication in new window or tab >>Power-Efficient Routing Algorithm for Torus NoCs
2008 (English)In: Proceedings of the International Conference on Contemporary Computing (IC3), 2008, 211-220 p.Conference paper, Published paper (Refereed)
Abstract [en]

Modern System-on-Chip (SoC) architectures use Network-on-Chip (NoC) for high-speed internode communication. NoC with torus interconnection topology is now popular due to its low dimension and simple structure. Torus NoC is very similar to the mesh NoC from a structural point of view, but has rather smaller diameter that makes it a suitable choice for NoCs. For a routing algorithm to be deadlock-free in a torus NoC at least two virtual channels should be used to avoid channel dependency, while mesh NoC can handle deadlock freedom using only one virtual channel. In this paper, we propose a novel approach on designing routing algorithms for mesh and torus NoCs. Also a deadlock free routing algorithm is proposed for Torus NoC that uses only one virtual channel per physical channel resulting in lower power consumption because of reduced hardware complexity and with no significant performance degradation. The algorithm works within a dimension and is applied to all dimensions individually for XY routing and various turn based deterministic routing algorithms like west first, north last and negative first. We have proved efficiency of the algorithm using simulation results obtained from synthesis of our implemented VHDL Register Transfer Level (RTL) model of NoC.

Keyword
SoC, NoC, Torus, Mesh, Performance, Power Consumption, Routing, Virtual Channel, Deadlock, VHDL RTL model
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-136415 (URN)
Conference
the International Conference on Contemporary Computing (IC3), Uttar Pradesh, India, Aug. 2008
Note

QC 20131205

Available from: 2013-12-05 Created: 2013-12-05 Last updated: 2015-02-06Bibliographically approved
6. A Framework for Designing Congestion-Aware Deterministic Routing
Open this publication in new window or tab >>A Framework for Designing Congestion-Aware Deterministic Routing
2010 (English)In: NoCArc '10 Proceedings of the Third International Workshop on Network on Chip Architectures, 2010, 45-50 p.Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we present a system-level Congestion-Aware Routing (CAR) framework for designing minimal deterministic routing algorithms. CAR exploits the peculiarities of the application workload to spread the load evenly across the network. To this end, we first formulate an optimization problem of minimizing the level of congestion in the network and then use the simulated annealing heuristic to solve this problem. The proposed framework assures deadlock-free routing, even in the networks without virtual channels. Experiments with both synthetic and realistic workloads show the effectiveness of the CAR framework. Results show that maximum sustainable throughput of the network is improved by up to 205% for different applications and architectures.

Keyword
Algorithms, C.2.1 [Network Architecture and Design], Network communications, Design, Performance
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-63728 (URN)10.1145/1921249.1921261 (DOI)2-s2.0-79951614608 (Scopus ID)978-145030397-2 (ISBN)
Conference
3rd International Workshop on Network on Chip Architectures, NoCArc 2010, in Conjunction with the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-43; Atlanta, GA; 4 December 2010 through 4 December 2010
Note

QC 20120203

Available from: 2012-01-24 Created: 2012-01-24 Last updated: 2013-12-05Bibliographically approved
7. Analytical approaches for performance evaluation of networks-on-chip
Open this publication in new window or tab >>Analytical approaches for performance evaluation of networks-on-chip
Show others...
2012 (English)In: CASES'12 - Proceedings of the 2012 ACM International Conference on Compilers, Architectures and Synthesis for Embedded Systems, Co-located with ESWEEK, ACM , 2012, 211-212 p.Conference paper, Published paper (Refereed)
Abstract [en]

This tutorial reviews four popular mathematical formalisms - dataflow analysis, schedulability analysis, network calculus, and queueing theory - and how they have been applied to the analysis of Network-on-Chip (NoC) performance. We review the basic concepts and results of each formalism and provide examples of how they have been used in on-chip communication performance analysis. The tutorial also discusses the respective strengths and weaknesses of each formalism, their suitability for a specific purpose, and the attempts that have been made to bridge these analytical approaches. Finally, we conclude the tutorial by discussing open research issues.

Place, publisher, year, edition, pages
ACM, 2012
Keyword
Analytical modeling, Design methodology, Network-on-chip, Performance evaluation, System-on-chip
National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-114012 (URN)10.1145/2380403.2380442 (DOI)000324163200028 ()2-s2.0-84869071319 (Scopus ID)978-145031424-4 (ISBN)
Conference
2012 ACM International Conference on Compilers, Architectures and Synthesis for Embedded Systems, CASES 2012, Co-located with 8th Embedded Systems Week, ESWEEK 2012, 7 October 2012 through 12 October 2012, Tampere, Finland
Note

QC 20130115

Available from: 2013-01-15 Created: 2013-01-15 Last updated: 2013-12-06Bibliographically approved
8. Power-efficient deterministic and adaptive routing in torus networks-on-chip
Open this publication in new window or tab >>Power-efficient deterministic and adaptive routing in torus networks-on-chip
2012 (English)In: Microprocessors and microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 36, no 7, 571-585 p.Article in journal (Refereed) Published
Abstract [en]

Modern SoC architectures use NoCs for high-speed inter-IP communication. For NoC architectures, high-performance efficient routing algorithms with low power consumption are essential for real-time applications. NoCs with mesh and torus interconnection topologies are now popular due to their simple structures. A torus NoC is very similar to the mesh NoC, but has rather smaller diameter. For a routing algorithm to be deadlock-free in a torus, at least two virtual channels per physical channel must be used to avoid cyclic channel dependencies due to the warp-around links; however, in a mesh network deadlock freedom can be insured using only one virtual channel. The employed number of virtual channels is important since it has a direct effect on the power consumption of NoCs. In this paper, we propose a novel systematic approach for designing deadlock-free routing algorithms for torus NoCs. Using this method a new deterministic routing algorithm (called TRANC) is proposed that uses only one virtual channel per physical channel in torus NoCs. We also propose an algorithmic mapping that enables extracting TRANC-based routing algorithms from existing routing algorithms, which can be both deterministic and adaptive. The simulation results show power consumption and performance improvements when using the proposed algorithms.

Place, publisher, year, edition, pages
Elsevier, 2012
Keyword
Adaptive, Deadlock, Deterministic, Mesh, NoC, Performance evaluation, Power consumption, Routing, SoC, Torus, VHDL, Virtual channel
National Category
Communication Systems
Identifiers
urn:nbn:se:kth:diva-90828 (URN)10.1016/j.micpro.2011.05.009 (DOI)000309441900005 ()2-s2.0-84865838319 (Scopus ID)
Note

QC 20121017

Available from: 2012-02-29 Created: 2012-02-29 Last updated: 2017-12-07Bibliographically approved
9. An Analytical Latency Model for Networks-on-Chip
Open this publication in new window or tab >>An Analytical Latency Model for Networks-on-Chip
2013 (English)In: IEEE Transactions on Very Large Scale Integration (vlsi) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 21, no 1, 113-123 p.Article in journal (Refereed) Published
Abstract [en]

We propose an analytical model based on queueing theory for delay analysis in a wormhole-switched network-on-chip (NoC). The proposed model takes as input an application communication graph, a topology graph, a mapping vector, and a routing matrix, and estimates average packet latency and router blocking time. It works for arbitrary network topology with deterministic routing under arbitrary traffic patterns. This model can estimate per-flow average latency accurately and quickly, thus enabling fast design space exploration of various design parameters in NoC designs. Experimental results show that the proposed analytical model can predict the average packet latency more than four orders of magnitude faster than an accurate simulation, while the computation error is less than 10% in non-saturated networks for different system-on-chip platforms.

Keyword
Modeling and prediction, network-on-chip (NoC), performance analysis and design aids, queueing theory
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-63609 (URN)10.1109/TVLSI.2011.2178620 (DOI)000312835000011 ()2-s2.0-84871803176 (Scopus ID)
Note

QC 20130204

Available from: 2012-01-24 Created: 2012-01-24 Last updated: 2017-12-08Bibliographically approved
10. Mathematical formalisms for performance evaluation of networks-on-chip
Open this publication in new window or tab >>Mathematical formalisms for performance evaluation of networks-on-chip
2013 (English)In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 45, no 3, 38- p.Article in journal (Refereed) Published
Abstract [en]

This article reviews four popular mathematical formalisms-queueing theory, network calculus, schedulability analysis, anddataflow analysis-and how they have been applied to the analysis of on-chip communication performance in Systems-on-Chip. The article discusses the basic concepts and results of each formalism and provides examples of how they have been used in Networks-on-Chip (NoCs) performance analysis. Also, the respective strengths and weaknesses of each technique and its suitability for a specific purpose are investigated. An open research issue is a unified analytical model for a comprehensive performance evaluation of NoCs. To this end, this article reviews the attempts that have been made to bridge these formalisms.

Keyword
Analytical modeling, Network-on-chip (NoC), Performance evaluation, System-on-chip (SoC)
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-134288 (URN)10.1145/2480741.2480755 (DOI)000321212300014 ()2-s2.0-84880103807 (Scopus ID)
Note

QC 20131121

Available from: 2013-11-21 Created: 2013-11-20 Last updated: 2017-12-06Bibliographically approved
11. A Heuristic Framework for Designing and Exploring Deterministic Routing Algorithm for NoCs
Open this publication in new window or tab >>A Heuristic Framework for Designing and Exploring Deterministic Routing Algorithm for NoCs
2013 (English)In: Algorithms in Networks-on-Chip, Springer, 2013, 21-39 p.Chapter in book (Refereed)
Abstract [en]

In this chapter, we present a system-level framework for designing minimal deterministic routing algorithms for Networks-on-Chip (NoCs) that are customized for a set of applications. To this end, we first formulate an optimization problem of minimizing average packet latency in the network and then use the simulated annealing heuristic to solve this problem. To estimate the average packet latency we use a queueing-based analytical model which can capture the burstiness of the traffic. The proposed framework does not require virtual channels to guarantee deadlock freedom since routes are extracted from an acyclic channel dependency graph. Experiments with both synthetic and realistic workloads show the effectiveness of the approach. Results show that maximum sustainable throughput of the network is improved for different applications and architectures.

Place, publisher, year, edition, pages
Springer, 2013
National Category
Computer Science
Identifiers
urn:nbn:se:kth:diva-136408 (URN)10.1007/978-1-4614-8274-1_2 (DOI)978-1-4614-8273-4 (ISBN)
Note

QC 20131205

Available from: 2013-12-05 Created: 2013-12-05 Last updated: 2013-12-05Bibliographically approved

Open Access in DiVA

Thesis(1161 kB)560 downloads
File information
File name FULLTEXT01.pdfFile size 1161 kBChecksum SHA-512
b12c2367ee45369085614a40de10ed2bc8224ca75be9b8b48749672ee2c08aa0c67335a2100b8c6358fec5b2b90a46f699dc3587f88e9ef9b8450e2ff965b6fa
Type fulltextMimetype application/pdf

Search in DiVA

By author/editor
Eslami Kiasari, Abbas
By organisation
Electronic Systems
Computer Science

Search outside of DiVA

GoogleGoogle Scholar
Total: 560 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

isbn
urn-nbn

Altmetric score

isbn
urn-nbn
Total: 260 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf