Change search
Refine search result
1 - 15 of 15
CiteExportLink to result list
Permanent link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Rows per page
  • 5
  • 10
  • 20
  • 50
  • 100
  • 250
Sort
  • Standard (Relevance)
  • Author A-Ö
  • Author Ö-A
  • Title A-Ö
  • Title Ö-A
  • Publication type A-Ö
  • Publication type Ö-A
  • Issued (Oldest first)
  • Issued (Newest first)
  • Created (Oldest first)
  • Created (Newest first)
  • Last updated (Oldest first)
  • Last updated (Newest first)
  • Disputation date (earliest first)
  • Disputation date (latest first)
  • Standard (Relevance)
  • Author A-Ö
  • Author Ö-A
  • Title A-Ö
  • Title Ö-A
  • Publication type A-Ö
  • Publication type Ö-A
  • Issued (Oldest first)
  • Issued (Newest first)
  • Created (Oldest first)
  • Created (Newest first)
  • Last updated (Oldest first)
  • Last updated (Newest first)
  • Disputation date (earliest first)
  • Disputation date (latest first)
Select
The maximal number of hits you can export is 250. When you want to export more records please use the Create feeds function.
  • 1. Daneshtalab, M.
    et al.
    Ebrahimi, M.
    Liljeberg, P.
    Plosila, J.
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Memory-Efficient On-Chip Network With Adaptive Interfaces2012In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 31, no 1, p. 146-159Article in journal (Refereed)
    Abstract [en]

    To achieve higher memory bandwidth in network-based multiprocessor architectures, multiple dynamic random access memories can be accessed simultaneously. In such architectures, not only resource utilization and latency are the critical issues but also a reordering mechanism is required to deliver the response transactions of concurrent memory accesses in-order. In this paper, we present a memory-efficient on-chip network architecture to cope with these issues efficiently. Each node of the network is equipped with a novel network interface (NI) to deal with out-of-order delivery, and a priority-based router to decrease the network latency. The proposed NI exploits a streamlined reordering mechanism to handle the in-order delivery and utilizes the advance extensible interface transaction-based protocol to maintain compatibility with existing intellectual property cores. To improve the memory utilization and reduce the memory latency, an optimized memory controller is integrated in the presented NI. Experimental results with synthetic test cases demonstrate that the proposed on-chip network architecture provides significant improvements in average network latency (16%), average memory access latency (19%), and average memory utilization (22%).

  • 2. Ellervee, P.
    et al.
    Miranda, M.
    Catthoor, F.
    Hemani, Ahmed
    KTH, Superseded Departments, Microelectronics and Information Technology, IMIT.
    System-level data-format exploration for dynamically allocated data structures2001In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 20, no 12, p. 1469-1472Article in journal (Refereed)
    Abstract [en]

    System-level exploration of memory organizations is a key issue in successful implementation of data dominated applications based on dynamically allocated data structures involving records and access keys. This paper presents a formalized technique for exploring different memory data-format alternatives when only the system level functional behavior of the application has been defined. Our data-format exploration approach allows to substantially minimize the number of accessed bits by rearranging the format of the data records. The technique exploits parallelism in the data transfer by analyzing the dependencies between data-record accesses. As a result, significant reduction in memory size, bandwidth, and power are obtained. We have validated our techniques using several real-life asynchronous transfer mode cell processing applications, where we have obtained reductions in memory size (up to 20%), power (up to a 60%), and bandwidth.

  • 3.
    Ellervee, Peeter
    et al.
    KTH, Superseded Departments, Microelectronics and Information Technology, IMIT.
    Miranda, Miguel
    IMEC.
    Catthoor, Francky
    IMEC, Katholieke Universiteit.
    Hemani, Ahmed
    KTH, Superseded Departments, Microelectronics and Information Technology, IMIT.
    System-level data-format exploration for dynamically allocated data structures2001In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 20, p. 1469-1472Article in journal (Refereed)
  • 4.
    Jafari, Fahimeh
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Yaghmaee, Mohammad Hossein
    Buffer Optimization in Network-on-Chip Through Flow Regulation2010In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 29, no 12, p. 1973-1986Article in journal (Refereed)
    Abstract [en]

    For network-on-chip (NoC) designs, optimizing buffers is an essential task since buffers are a major source of cost and power consumption. This paper proposes flow regulation and has defined a regulation spectrum as a means for system-on-chip architects to control delay and backlog bounds. The regulation is performed per flow for its peak rate and burstiness. However, many flows may have conflicting regulation requirements due to interferences with each other. Based on the regulation spectrum, this paper optimizes the regulation parameters aiming for buffer optimization. Three timing-constrained buffer optimization problems are formulated, namely, buffer size minimization, buffer variance minimization, and multiobjective optimization, which has both buffer size and variance as minimization objectives. Minimizing buffer variance is also important because it affects the modularity of routers and network interfaces. A realistic case study exhibits 62.8% reduction of total buffers, 84.3% reduction of total latency, and 94.4% reduction on the sum of variances of buffers. Likewise, the experimental results demonstrate similar improvements in the case of synthetic traffic patterns. The optimization algorithm has low run-time complexity, enabling quick exploration of large design spaces. This paper concludes that optimal flow regulation can be a highly valuable instrument for buffer optimization in NoC designs.

  • 5. Long, Yanchen
    et al.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT).
    Shen, Haibin
    Composable Worst-Case Delay Bound Analysis Using Network Calculus2018In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 37, no 3, p. 705-709Article in journal (Refereed)
    Abstract [en]

    Performance analysis is playing an indispensable role in design and evaluation for on-chip networks. In former studies, the end-to-end delay bound is calculated by the equivalent service curve method based on network calculus when resource sharing happens. However, in this paper, we propose a composable method to get the bound. This method uses the aggregated local arrival curve to get the local delay bound first, then calculates the end-to-end bound by summing up local bounds. This method solves the scalability problem and largely decreases the computation complexity compared with the former method.

  • 6.
    Lu, Zhonghai
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronics.
    Zhao, Xueqian
    xMAS-Based QoS Analysis Methodology2018In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 37, no 2, p. 364-377Article in journal (Refereed)
    Abstract [en]

    On-chip communication system design starting from a high-level model can facilitate formal verification of system properties, such as safety and deadlock freedom. Yet, analyzing its quality-of-service (QoS) property, in our context, per-flow delay bound, is an open challenge. Based on executable micro-architectural specification (xMAS) which is a formal framework modeling communication fabrics, we first present how to model a classic input-queuing virtual channel router using the xMAS primitives and then a QoS analysis methodology using network calculus (NC). Thanks to the precise semantics of the xMAS primitives, the router can be modeled in different variants, which cannot be otherwise captured by normal ad hoc box diagrams. The analysis methodology consists of three steps: 1) given network and flow knowledge, we first create a well-defined precise xMAS model for a specific application on a concrete on-chip network; 2) the specific xMAS model is then mapped to an NC graph (NCG) following a set of mapping rules; and 3) finally, existing QoS analysis techniques can be applied to analyze the NCG to obtain end-to-end delay bound per flow. We also show how to apply the technique to a typical all-to-one communication pattern on a binary-tree network and conduct an SoC case study, exemplifying the step-by-step analysis procedure and discussing the tightness of the results.

  • 7.
    Naeem, Abdul
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Scalability Analysis of Memory Consistency Models in NoC-based Distributed Shared Memory SoCs2013In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 32, no 5, p. 760-773Article in journal (Refereed)
    Abstract [en]

    We analyze the scalability of six memory consistency models in network-on-chip (NoC)-based distributed shared memory multicore systems: 1) protected release consistency (PRC); 2) release consistency (RC); 3) weak consistency (WC); 4) partial store ordering (PSO); 5) total store ordering (TSO); and 6) sequential consistency (SC). Their realizations are based on a transaction counter and an address-stack-based approach. The scalability analysis is based on different workloads mapped on various sizes of networks using different problem sizes. For the experiments, we use Nostrum NoC-based configurable multicore platform with a 2-D mesh topology and a deflection routing algorithm. Under the synthetic workloads, the average execution time for the PRC, RC, WC, PSO, and TSO models in the 8 x 8 network (64-cores) is reduced by 32.3%, 28.3%, 20.1%, 13.8%, and 9.9% over the SC model, respectively. For the application workloads, as the network size grows, the average execution time under these relaxed memory models decreases with respect to the SC model depending on the application and its match to the architecture. The performance improvement of the PRC and RC models over the SC model tends to be higher than 50% as observed in the experiments, when the system is further scaled up. The area cost in the network interface for the relaxed memory models is increased by less than 4% over the SC model.

  • 8.
    Pamunuwa, Dinesh
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic, Computer and Software Systems, ECS.
    Elassaad, Shauki
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic, Computer and Software Systems, ECS.
    Modeling or delay and noise in arbitrarily coupled RC trees2005In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 24, no 11, p. 1725-1739Article in journal (Refereed)
    Abstract [en]

    Closed-form equations for second-order transfer functions of general arbitrarily coupled resistance-capacitance (RC) trees with multiple drivers are reported. The models allow precise delay and noise calculations for systems of coupled interconnects with guaranteed stability and represent the minimum complexity associated with this class of circuits. Their accuracy is extensively compared against other relevant models and is found to be better or comparable to more expensive models. All results are derived from a theoretical approach, and their physical basis is examined. The simplicity, accuracy, and generality of the models make them suitable for use in early signal integrity analyses of complex systems and incremental physical optimization.

  • 9. Qian, Yue
    et al.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronic Systems.
    Dou, Wenhua
    Analysis of Worst-Case Delay Bounds for On-Chip Packet-Switching Networks2010In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 29, no 5, p. 802-815Article in journal (Refereed)
    Abstract [en]

    In network-on-chip (NoC), computing worst-case delay bounds for packet delivery is crucial for designing predictable systems but yet an intractable problem. This paper presents an analysis technique to derive per-flow communication delay bound. Based on a network contention model, this technique, which is topology independent, employs network calculus to first compute the equivalent service curve for an individual flow and then calculate its packet delay bound. To exemplify this method, this paper also presents the derivation of a closed-form formula to compute a flow's delay bound under all-to-one gather communication. Experimental results demonstrate that the theoretical bounds are correct and tight.

  • 10.
    Raudvere, Tarvo
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronic, Computer and Software Systems, ECS.
    Sander, Ingo
    KTH, School of Information and Communication Technology (ICT), Electronic, Computer and Software Systems, ECS.
    Jantsch, Axel
    KTH, School of Information and Communication Technology (ICT), Electronic, Computer and Software Systems, ECS.
    Application and. verification of local nonsemantic-preserving transformations in system-design2008In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 27, no 6, p. 1091-1103Article in journal (Refereed)
    Abstract [en]

    Due to the increasing abstraction gap between the initial system model and a final implementation, the verification of the respective models against each other is a formidable task. This paper addresses the verification problem by proposing a stepwise application of combined refinement and verification activities in the context of synchronous model of computation. An implementation model is developed from the system model by applying pre-defined design transformations which are as follows: 1) semantic preserving or 2) nonsemantic preserving. Nonsemantic-preserving transformations introduce lower level implementation details, which are necessary to yield an efficient implementation. Our approach divides the verification tasks into two activities: 1) the local correctness of a refined block is checked by using formal verification tools and predefined properties, which are developed for each nonsemantic-preserving transformation, and 2) the global influence of the refinement to the entire system is studied through static analysis. We illustrate the design refinement and verification approach with three transformations: 1) a communication refinement mapping a synchronous channel to an asynchronous one including a handshake mechanism; 2) a computation refinement, which introduces resource sharing in a combinational computation block; and 3) a synchronization demanding refinement, where an algorithm analyzes the influence of a local refinement to the temporal properties of the entire system and restores the system's correct temporal behavior if necessary.

  • 11.
    Sander, Ingo
    et al.
    KTH, Superseded Departments, Electronic Systems Design.
    Jantsch, Axel
    KTH, Superseded Departments, Electronic Systems Design.
    System modeling and transformational design refinement in ForSyDe2004In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 23, no 1, p. 17-32Article in journal (Refereed)
    Abstract [en]

    The scope, of the Formal System Design (ForSyDe) methodology is high-level modeling and refinement of systems-on-a-chip and embedded systems. Starting with a formal specification model, that captures the functionality of the system at a high abstraction level, it provides formal design-transformation methods for a transparent refinement process of the system model into an implementation model that is optimized for synthesis. The main contribution of this paper is the ForSyDe modeling technique and the formal treatment of transformational design refinement. We introduce process constructors, that cleanly separate the computation part of a process from the synchronization and communication part. We develop the characteristic function for each process type and use it to define semantic preserving and design decision transformations. These transformations are characterized by name, the format of the original process network, the transformed process network, and a design implication. The implication expresses the relation between original and transformed process network by means of the characteristic function. The objective of the refinement process is a model that can be implemented cost efficiently. To this end, process constructors and processes have a hardware and software interpretation which shall facilitate accurate performance and cost estimations. In a study of a digital equalizer example, we illustrate the modeling and refinement process and focus in particular on refinement of the clock domain, communication refinement, and resource sharing.

  • 12.
    Sou, Kin Cheong
    et al.
    Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
    Daniel, Luca
    Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
    Megretski, Alexandre
    Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
    A Quasi-Convex Optimization Approach to Parameterized Model Order Reduction2008In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 27, no 3, p. 456-469Article in journal (Refereed)
    Abstract [en]

    In this paper, an optimization-based model orderreduction (MOR) framework is proposed. The method involvessetting up a quasi-convex program that solves a relaxation of theoptimal H∞ norm MOR problem. The method can generate guaranteedstable and passive reduced models and is very flexible inimposing additional constraints such as exact matching of specificfrequency response samples. The proposed optimization-basedapproach is also extended to solve the parameterized modelreductionproblem (PMOR). The proposed method is compared to existing moment matching and optimization-basedMOR methodsin several examples. PMOR models for large RF inductors oversubstrate and power-distribution grid are also constructed.

  • 13. Tuuna, S.
    et al.
    Isoaho, J.
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Electronic, Computer and Software Systems, ECS.
    Analytical model for crosstalk and intersymbol interference in point-to-point buses2006In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 25, no 7, p. 1400-1411Article in journal (Refereed)
    Abstract [en]

    In this paper, an analytical model to estimate crosstalk noise and intersymbol interference on capacitively and inductively coupled point-to-point on-chip buses is derived. The derived closed-form equation for output voltage enables the usage of the model in computer-aided design (CAD) tools for complex systems where high simulation speed is essential. The model also combines together properties such as inductive coupling, initial conditions, signal rise time, input phases, and bit sequences that have not been included in a single closed-form model before. The model is compared to HSPICE and previous models. The model and HSPICE are in good agreement with each other.

  • 14. Weerasekera, Roshan
    et al.
    Pamunuwa, Dinesh
    Zheng, Li-Rong
    KTH, School of Information and Communication Technology (ICT), Centres, VinnExcellence Center for Intelligence in Paper and Packaging, iPACK. KTH, School of Information and Communication Technology (ICT), Electronic, Computer and Software Systems, ECS.
    Tenhunen, Hannu
    KTH, School of Information and Communication Technology (ICT), Microelectronics and Information Technology, IMIT.
    Two-Dimensional and Three-Dimensional Integration of Heterogeneous Electronic Systems Under Cost, Performance, and Technological Constraints2009In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 28, no 8, p. 1237-1250Article in journal (Refereed)
    Abstract [en]

    Present day market demand for high-performance high-density portable hand-held applications has shifted the focus from 2-D planar system-on-a-chip-type single-chip solutions to alternatives such as tiled silicon and single-level embedded modules as well as 3-D die stacks. Among the various choices, finding an optimal solution for system implementation deals usually with cost, performance, power, thermal, and technological tradeoff analyses at the system conceptual level. It has been estimated that decisions made in the first 20% of the design cycle influence up to 80% of the final product cost. In this paper, we discuss realistic metrics appropriate for performance and cost tradeoff analyses both at the system conceptual level in the early stages of the design cycle and in the implementation phase, for verification. In order to validate the proposed metrics and methodology, two ubiquitous electronic systems are analyzed under various implementation schemes and the performance tradeoffs discussed. This case study is used to highlight the importance of a cost and performance tradeoff analysis early in the design flow.

  • 15.
    Zhao, Xueqian
    et al.
    KTH, School of Information and Communication Technology (ICT), Electronics and Embedded Systems.
    Lu, Zhonghai
    KTH, School of Information and Communication Technology (ICT), Electronics and Embedded Systems.
    Heuristics-Aided Tightness Evaluation of Analytical Bounds in Networks-on-Chip2015In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 34, no 6, p. 986-999, article id 7038204Article in journal (Refereed)
    Abstract [en]

    Studying the tightness of analytical delay and backlog bounds is critical for network-on-chip designs, since formal analysis predicts the boundary of communication delay and buffer dimensioning. However, this evaluation process is often a tedious, time-consuming, and manual simulation process whereas many simulation parameters have to be configured before the simulations run. We formulate the tightness evaluation as constrained optimization problems for delay bound and backlog bounds, respectively. The well-defined problems enable a fully automated configuration searching process, which can be guided by a heuristic algorithm with cycle-accurate simulations integrated. This is a fully automated procedure and thus provides a promising path to automatic design space exploration in similar contexts. Experimental results over various topologies and traffic patterns indicate that our method is effective in finding the configuration for best tightness up to 98%, even when up to 50 parameters are configured in a multidimensional discrete search space under complex constraints.

1 - 15 of 15
CiteExportLink to result list
Permanent link
Cite
Citation style
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf