  • 1.
    Becker, Matthias
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Chen, DeJiu
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Machine Design (Div.). KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Towards QoS-Aware Service-Oriented Communication in E/E Automotive Architectures. 2018. In: Proceedings of the 44th Annual Conference of the IEEE Industrial Electronics Society (IECON), Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 4096-4101, article id 8591521. Conference paper (Refereed)
    Abstract [en]

    With the rise of increasingly advanced driving assistance systems in modern cars, execution platforms that build on the principle of service-oriented architectures are being proposed. Alongside, service-oriented communication is used to provide the required adaptive communication infrastructure on top of automotive Ethernet networks. A middleware is proposed that enables QoS-aware service-oriented communication between software components, where the prescribed behavior of each software component is defined by Assume/Guarantee (A/G) contracts. To enable the use of COTS components, which are often not sufficiently verified for use in automotive systems, the middleware monitors the communication behavior of components and verifies it against the component's A/G contract. A violation of the allowed communication behavior then triggers adaptation processes in the system while the impact on other communication is minimized. The applicability of the approach is demonstrated by a case study that utilizes a prototype implementation of the proposed approach.

  • 2.
    Becker, Matthias
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Mubeen, Saad
    Mälardalen University.
    Timing Analysis Driven Design-Space Exploration of Cause-Effect Chains in Automotive Systems. 2018. In: IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society, 2018. Conference paper (Refereed)
    Abstract [en]

    Model-based development and component-based software engineering have emerged as a promising approach to deal with enormous software complexity in automotive systems. This approach supports the development of software architectures by interconnecting (and reusing) software components (SWCs) at various abstraction levels. Automotive software architectures are often modeled with chains of SWCs, also called cause-effect chains, that are constrained by timing requirements. Based on the variations in activation patterns of SWCs, a single model of a cause-effect chain at a higher abstraction level can conform to several valid refined models of the chain at a lower abstraction level, which is closer to the system implementation. As a consequence, the total number of valid implementation-level models generated by the existing techniques increases exponentially, thereby significantly increasing the runtime of the timing analysis engines and limiting the scalability of the existing techniques. This paper computes an upper bound on the activation pattern combinations that may result from a system of cause-effect chains in a given high-level model of the software architecture. An efficient algorithm is presented that traverses only a reduced number of possible combinations of the cause-effect chains, resulting in the timing analysis of a significantly lower number of implementation-level models of the software architecture. A proof of concept is provided by conducting a case study that shows significant reduction in the runtime of timing analysis engines, i.e., the timing behavior of the considered system is verified by performing the timing analysis of only 27% of all possible combinations of the cause-effect chains.

  • 3. Ben Dhaou, I.
    et al.
    Kondoro, Aron
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics. University of Dar es Salaam, Tanzania.
    Kelati, Amleset
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems. University of Turku, Finland.
    Rwegasira, Diana
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics. University of Turku, Finland.
    Naiman, S.
    Mvungi, N. H.
    Tenhunen, Hannu
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics.
    Communication and security technologies for smart grid. 2018. In: Fog Computing: Breakthroughs in Research and Practice, IGI Global, 2018, p. 305-331. Chapter in book (Other academic)
    Abstract [en]

    The smart grid is a new paradigm that aims to modernize the legacy power grid. It is based on the integration of ICT technologies, embedded systems, sensors, renewable energy and advanced algorithms for management and optimization. The smart grid is a system of systems in which communication technology plays a vital role. Safe operation of the smart grid needs a careful design of the communication protocols, cryptographic schemes, and computing technology. In this article, the authors describe current communication technologies and recently proposed algorithms, protocols, and architectures for securing the smart grid communication network. They analyze, in a unifying approach, the three principal pillars of the smart grid: sensors, communication technologies, and security. Finally, the authors elaborate on open issues in the smart grid communication network.

  • 4. Charif, Amir
    et al.
    Coelho, Alexandre
    Ebrahimi, Masoumeh
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems. KTH.
    Bagherzadeh, Nader
    Zergainoh, Nacer-Eddine
    First-Last: A Cost-Effective Adaptive Routing Solution for TSV-Based Three-Dimensional Networks-on-Chip. 2018. In: IEEE Transactions on Computers, ISSN 0018-9340, Vol. 67, no 10, p. 1430-1444. Article in journal (Refereed)
    Abstract [en]

    3D integration opens up new opportunities for future multiprocessor chips by enabling fast and highly scalable 3D Network-on-Chip (NoC) topologies. However, in an effort to reduce the cost of Through-Silicon Vias (TSVs), partially vertically connected NoCs, in which only a few vertical TSV links are available, have been gaining relevance. To reliably route packets under such conditions, we introduce a lightweight, efficient and highly resilient adaptive routing algorithm targeting partially vertically connected 3D-NoCs, named First-Last. It requires a very low number of virtual channels (VCs) to achieve deadlock-freedom (2 VCs in the East and North directions and 1 VC in all other directions), and guarantees packet delivery as long as one healthy TSV connecting all layers is available anywhere in the network. An improved version of our algorithm, named Enhanced-First-Last, is also introduced and shown to dramatically improve performance under low TSV availability while still using fewer virtual channels than state-of-the-art algorithms. A comprehensive evaluation of the cost and performance of our algorithms is performed to demonstrate their merits with respect to existing solutions.

  • 5.
    Chen, DeJiu
    et al.
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Embedded Control Systems. KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Machine Design (Div.). KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Östberg, Kenneth
    RISE - Research Institutes of Sweden.
    Becker, Matthias
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Sivencrona, Håkan
    Zenuity AB.
    Warg, Fredrik
    RISE - Research Institutes of Sweden.
    Design of a Knowledge-Base Strategy for Capability-Aware Treatment of Uncertainties of Automated Driving Systems. 2018. In: Computer Safety, Reliability, and Security / [ed] Gallina B., Skavhaug A., Schoitsch E., Bitsch F., Cham, 2018, Vol. 11094. Conference paper (Refereed)
    Abstract [en]

    Automated Driving Systems (ADS) represent a key technological advancement in the area of Cyber-physical systems (CPS) and Embedded Control Systems (ECS) with the aim of promoting traffic safety and environmental sustainability. The operation of ADS, however, exhibits several uncertainties that, if improperly treated in development and operation, would lead to safety and performance related problems. This paper presents the design of a knowledge-base (KB) strategy for a systematic treatment of such uncertainties and their system-wide implications on design-space and state-space. In the context of this approach, we use the term Knowledge-Base (KB) to refer to the model that stipulates the fundamental facts of a CPS in regard to the overall system operational states, action sequences, as well as the related costs or constraint factors. The model constitutes a formal basis for describing, communicating and inferring particular operational truths as well as the belief and knowledge representing the awareness or comprehension of such truths. For the reasoning of ADS behaviors and safety risks, each system operational state is explicitly formulated as a conjunction of environmental state and some collective states showing the ADS capabilities for perception, control and actuation. Uncertainty Models (UM) are associated as attributes to such state definitions for describing and quantifying the corresponding belief or knowledge status due to the presence of evidence about system performance and deficiencies, etc. From a broader perspective, the approach is part of our research on bridging the gaps among intelligent functions, system capability and dependability for mission- and safety-critical CPS, through a combination of development- and run-time measures.

  • 6. Chen, Kun-Chih (Jimmy)
    et al.
    Ebrahimi, Masoumeh
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Wang, Ting-Yi
    Yang, Yuch-Chi
    NoC-based DNN Accelerator: A Future Design Paradigm. 2019. Conference paper (Refereed)
    Abstract [en]

    Deep Neural Networks (DNN) have shown significant advantages in many domains such as pattern recognition, prediction, and control optimization. The edge computing demand in the Internet-of-Things era has motivated many kinds of computing platforms to accelerate the DNN operations. The most common platforms are CPU, GPU, ASIC, and FPGA. However, these platforms suffer from low performance (i.e., CPU and GPU), large power consumption (i.e., CPU, GPU, ASIC, and FPGA), or low computational flexibility at runtime (i.e., FPGA and ASIC). In this paper, we suggest the NoC-based DNN platform as a new accelerator design paradigm. The NoC-based designs can reduce the off-chip memory accesses through a flexible interconnect that facilitates data exchange between processing elements on the chip. We first comprehensively investigate conventional platforms and methodologies used in DNN computing. Then we study and analyze different design parameters to implement the NoC-based DNN accelerator. The presented accelerator is based on mesh topology, neuron clustering, random mapping, and XY-routing. The experimental results on LeNet, MobileNet, and VGG-16 models show the benefits of the NoC-based DNN accelerator in reducing off-chip memory accesses and improving runtime computational flexibility.
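
    The abstract above names XY-routing on a mesh topology as one ingredient of the accelerator. As a rough illustration only, the following is a minimal sketch of textbook dimension-ordered XY routing on a 2D mesh; the function name `xy_route` and the coordinate convention are assumptions made for this example and are not taken from the paper.

```python
def xy_route(src, dst):
    """Return the routers visited from src to dst (inclusive) on a 2D mesh.

    Hypothetical helper: a packet first travels along the X dimension
    until its column matches the destination, then along Y.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    # Phase 1: correct the X coordinate one hop at a time.
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    # Phase 2: correct the Y coordinate one hop at a time.
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Route from router (0, 0) to router (2, 1).
    print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

    Because every packet finishes all of its X hops before taking any Y hop, the route is deterministic and its length equals the Manhattan distance between source and destination.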

  • 7.
    Chen, Xiaowen
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Efficient Memory Access and Synchronization in NoC-based Many-core Processors. 2019. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    In NoC-based many-core processors, memory subsystem and synchronization mechanism are always the two important design aspects, since mining parallelism and pursuing higher performance require not only optimized memory management but also efficient synchronization mechanism. Therefore, we are motivated to research on efficient memory access and synchronization in three topics, namely, efficient on-chip memory organization, fair shared memory access, and efficient many-core synchronization.

    One major way of optimizing the memory performance is constructing a suitable and efficient memory organization. A distributed memory organization is more suitable to NoC-based many-core processors, since it features good scalability. We envision that it is essential to support Distributed Shared Memory (DSM) because of the huge amount of legacy code and easy programming. Therefore, we first adopt the microcoded approach to address DSM issues, aiming for hardware performance but maintaining the flexibility of programs. Second, we further optimize the DSM performance by reducing the virtual-to-physical address translation overhead. In addition to the general-purpose memory organization such as DSM, there exists special-purpose memory organization to optimize the performance of application-specific memory access. We choose Fast Fourier Transform (FFT) as the target application, and propose a multi-bank data memory specialized for FFT computation.

    In 3D NoC-based many-core processors, because processor cores and memories reside in different locations (center, corner, edge, etc.) of different layers, memory accesses behave differently due to their different communication distances. As the network size increases, the communication distance difference of memory accesses becomes larger, resulting in unfair memory access performance among different processor cores. This unfair memory access phenomenon may lead to high latencies of some memory accesses, thus negatively affecting the overall system performance. Therefore, we are motivated to study on-chip memory and DRAM access fairness in 3D NoC-based many-core processors through narrowing the round-trip latency difference of memory accesses as well as reducing the maximum memory access latency.

    Barrier synchronization is used to synchronize the execution of parallel processor cores. Conventional barrier synchronization approaches such as master-slave, all-to-all, tree-based, and butterfly are algorithm oriented. As many processor cores are networked on a single chip, contended synchronization requests may cause a large performance penalty. Motivated by this, and unlike the algorithm-based approaches, we choose another direction (i.e., exploiting efficient communication) to address the barrier synchronization problem. We propose cooperative communication as a means and combine it with the master-slave algorithm and the all-to-all algorithm to achieve efficient many-core barrier synchronization. In addition, a multi-FPGA implementation case study of fast many-core barrier synchronization is conducted.

  • 8.
    Chen, Xiaowen
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS). Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China.
    Lei, Yuanwu
    Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Chen, Shuming
    Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China.
    A Variable-Size FFT Hardware Accelerator Based on Matrix Transposition. 2018. In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 26, no 10, p. 1953-1966. Article in journal (Refereed)
    Abstract [en]

    Fast Fourier transform (FFT) is the kernel and the most time-consuming algorithm in the domain of digital signal processing, and the FFT sizes of different applications are very different. Therefore, this paper proposes a variable-size FFT hardware accelerator, which fully supports the IEEE-754 single-precision floating-point standard and the FFT calculation with a wide size range from 2 to 2^20 points. First, a parallel Cooley-Tukey FFT algorithm based on matrix transposition (MT) is proposed, which can efficiently divide a large size FFT into several small size FFTs that can be executed in parallel. Second, guided by this algorithm, the FFT hardware accelerator is designed, and several FFT performance optimization techniques such as hybrid twiddle factor generation, multibank data memory, block MT, and token-based task scheduling are proposed. Third, its VLSI implementation is detailed, showing that it can work at 1 GHz with an area of 2.4 mm^2 and a power consumption of 91.3 mW at 25 °C, 0.9 V. Finally, several experiments are carried out to evaluate the proposal's performance in terms of FFT execution time, resource utilization, and power consumption. Comparative experiments show that our FFT hardware accelerator achieves at most 18.89x speedups in comparison to two software-only solutions and two hardware-dedicated solutions.
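
    The abstract above describes dividing a large FFT into several small FFTs by means of matrix transposition. The sketch below shows the generic textbook Cooley-Tukey decomposition for N = n1 * n2 (often called the four-step FFT), which is one way such a split can be organized; it is not claimed to be the paper's algorithm. The names `dft` and `fft_by_transposition` are hypothetical, and a naive DFT stands in for the small transforms to keep the example short.

```python
import cmath


def dft(seq):
    """Naive O(n^2) DFT, used here as the 'small FFT' building block."""
    n = len(seq)
    return [
        sum(seq[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_by_transposition(x, n1, n2):
    """Length n1*n2 DFT built from n1 DFTs of length n2 and n2 DFTs of length n1."""
    n = n1 * n2
    assert len(x) == n
    # View the input as an n1 x n2 matrix with element [r][c] = x[r + n1*c].
    rows = [[x[r + n1 * c] for c in range(n2)] for r in range(n1)]
    # Step 1: independent length-n2 DFTs on every row (parallelizable).
    rows = [dft(row) for row in rows]
    # Step 2: multiply element [r][k2] by the twiddle factor W_N^(r*k2).
    rows = [
        [rows[r][k2] * cmath.exp(-2j * cmath.pi * r * k2 / n) for k2 in range(n2)]
        for r in range(n1)
    ]
    # Step 3: matrix transposition, giving n2 rows of length n1.
    cols = [list(col) for col in zip(*rows)]
    # Step 4: independent length-n1 DFTs on every transposed row.
    cols = [dft(col) for col in cols]
    # Output index mapping: X[n2*k1 + k2] = cols[k2][k1].
    out = [0j] * n
    for k2 in range(n2):
        for k1 in range(n1):
            out[n2 * k1 + k2] = cols[k2][k1]
    return out
```

    The row transforms in steps 1 and 4 do not depend on each other, which is what makes this kind of decomposition attractive for parallel execution; the transposition in step 3 is the only stage that moves data between rows.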

  • 9.
    Chen, Yancang
    et al.
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China.
    Xie, Lunguo
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China.
    Li, Jinwen
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China.
    Shi, Zhu
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China.
    Zhang, Minxuan
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China.
    Chen, Xiaowen
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    A Trace-driven Hardware-level Simulator for Design and Verification of Network-on-Chips. 2010. In: 2011 INTERNATIONAL CONFERENCE ON COMPUTERS, COMMUNICATIONS, CONTROL AND AUTOMATION (CCCA 2011), VOL II / [ed] Thaung, K S, IEEE, 2010, p. 32-35. Conference paper (Refereed)
    Abstract [en]

    Traditional communication schemes for general-purpose multi-core processors and application-specific Systems-on-Chip face challenges in terms of scalability and complexity. Network-on-Chip (NoC) has been the most promising solution for the communications of multi-core and many-core chips. In this paper, we present a trace-driven hardware-level simulator (denoted HS) based on SystemVerilog for the design and verification of NoCs. Different from the state-of-the-art NoC simulators, the HS has three important characteristics in addition to the capability of creating simulation and synthesizable NoC descriptions: 1) hardware-level simulation can be done, which captures more hardware implementation details than flit-level simulation; 2) router debugging and verification can be done at RTL by inserting assertions and coverage; 3) trace-based application simulations can be done besides synthetic workloads. A 4 x 4 2D mesh NoC with output virtual-channel routers verifies the capability of our HS.

  • 10.
    Chen, Zhe
    et al.
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China.
    Guo, Shize
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China.
    Wang, Jian
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China.
    Li, Yubai
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Toward FPGA Security in IoT: A New Detection Technique for Hardware Trojans. 2019. In: IEEE Internet of Things Journal, ISSN 2327-4662, Vol. 6, no 4, p. 7061-7068. Article in journal (Refereed)
    Abstract [en]

    Nowadays, field programmable gate arrays (FPGAs) are widely used in the Internet of Things (IoT) since they can provide flexible and scalable solutions to various IoT requirements. Meanwhile, hardware Trojans (HTs), which may lead to undesired chip function or leak sensitive information, have become a great challenge for FPGA security. Therefore, distinguishing the Trojan-infected FPGAs is crucial for reinforcing the security of IoT. To achieve this goal, we propose a clock-tree-concerned technique to detect HTs on FPGAs. First, we present an experimental framework which helps us to collect the electromagnetic (EM) radiation emitted by the FPGA clock tree. Then, we propose a Trojan identifying approach which extracts the mathematical feature of obtained EM traces, i.e., 2-D principal component analysis (2DPCA) in this paper, and automatically isolates the Trojan-infected FPGAs from the Trojan-free ones by using a BP neural network. Finally, we perform extensive experiments to evaluate the effectiveness of our method. The results reveal that our approach is valid in detecting HTs on FPGAs. Specifically, for the trust-hub benchmarks, we can identify the FPGAs with always-on Trojans (100% detection rate) while identifying the triggered Trojans with high probability (by up to 92%). In addition, we give a thorough discussion on how the experimental setup, such as probe step size, scanning area, and chip ambient temperature, affects the Trojan detection rate.

  • 11.
    Du, Gaoming
    et al.
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China.
    Liu, Guanyu
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China.
    Li, Zhenmin
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China.
    Cao, Yifan
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China.
    Zhang, Duoli
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China.
    Ouyang, Yiming
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China.
    Gao, Minglun
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    SSS: Self-aware System-on-chip Using a Static-dynamic Hybrid Method. 2019. In: ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, ISSN 1550-4832, Vol. 15, no 3, article id 28. Article in journal (Refereed)
    Abstract [en]

    Network-on-Chip (NoC) has become the de facto communication standard for multi-core or many-core System-on-Chip (SoC) due to its scalability and flexibility. However, an important factor in NoC design is temperature, which affects the overall performance of the SoC by decreasing circuit frequency, increasing energy consumption, and even shortening chip lifetime. In this article, we propose SSS, a self-aware SoC using a static-dynamic hybrid method that combines dynamic mapping and static mapping to reduce the hotspot temperature for NoC-based SoCs. First, we propose monitoring and thermal modeling for self-state sensing. Then, in the static mapping stage, we calculate the optimal mapping solutions under different temperature modes using the discrete firefly algorithm to help self-decision-making. Finally, in the dynamic mapping stage, we achieve dynamic mapping through configuring NoC and SoC sentient units for self-optimizing. Experimental results show that SSS reduces the peak temperature by up to 37.52%. The FPGA prototype demonstrates the effectiveness and smartness of SSS in reducing hotspot temperature.

  • 12.
    Dubrova, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    A reconfigurable arbiter PUF with 4 x 4 switch blocks. 2018. In: Proceedings of The International Symposium on Multiple-Valued Logic, IEEE Computer Society, 2018, p. 31-37. Conference paper (Refereed)
    Abstract [en]

    Physical Unclonable Functions (PUFs) exploit manufacturing process variation to create responses that are unique to individual integrated circuits (ICs). Typically, responses of a PUF cannot be modified once the PUF is fabricated. In applications which use PUFs as a long-term secret key, it would be useful to have a simple mechanism for reconfiguring the PUF in order to update the key periodically. In this paper, we present a new type of arbiter PUF which uses 4 x 4 switch blocks instead of the conventional 2 x 2 ones. Each 4 x 4 switch block can be reconfigured in many different ways during the PUF's lifetime, making regular key updates possible. © 2018 IEEE.

  • 13.
    Dubrova, Elena
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Selander, G.
    Näslund, Mats
    KTH.
    Lindqvist, Fredrik
    KTH.
    Lightweight message authentication for constrained devices. 2018. In: WiSec 2018 - Proceedings of the 11th ACM Conference on Security and Privacy in Wireless and Mobile Networks, Association for Computing Machinery (ACM), 2018, p. 196-201. Conference paper (Refereed)
    Abstract [en]

    Message Authentication Codes (MACs) used in today's wireless communication standards may not be able to satisfy resource limitations of simpler 5G radio types and use cases such as machine type communications. As a possible solution, we present a lightweight message authentication scheme based on the cyclic redundancy check (CRC). It has been previously shown that a CRC with an irreducible generator polynomial as the key is an ε-almost XOR-universal (ε-AXU) hash function with ε = (m + n)/2^(n-1), where m is the message size and n is the CRC size. While the computation of n-bit CRCs can be efficiently implemented in hardware using linear feedback shift registers, generating random degree-n irreducible polynomials is computationally expensive for large n. We propose using a product of k irreducible polynomials whose degrees sum up to n as a generator polynomial for an n-bit CRC and show that the resulting hash functions are ε-AXU with ε = (m + n)^k/2^(n-k). The presented message authentication scheme can be seen as providing a trade-off between security and implementation efficiency.
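
    To make the quantities in the abstract above concrete, the following is a minimal sketch of a CRC computed over GF(2) with a generator polynomial formed as a product of irreducible polynomials, together with an ε bound evaluated in the form ε = (m + n)^k/2^(n-k) (with k = 1 this reduces to (m + n)/2^(n-1)). The helper names, the bit encoding of polynomials as integers, and the tiny example polynomials are assumptions made for illustration; this covers only the CRC hash and the bound, not the complete authentication scheme of the paper.

```python
from fractions import Fraction


def gf2_mul(a, b):
    """Multiply two GF(2) polynomials encoded as ints (bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_mod(a, g):
    """Remainder of polynomial a divided by nonzero polynomial g over GF(2)."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


def crc(message, g):
    """n-bit CRC of message: remainder of message * x^n modulo g, where n = deg(g)."""
    n = g.bit_length() - 1
    return gf2_mod(message << n, g)


def axu_bound(m, n, k=1):
    """Exact value of (m + n)^k / 2^(n - k) for message size m, CRC size n, k factors."""
    return Fraction((m + n) ** k, 2 ** (n - k))


# Toy generator: (x^3 + x + 1) * (x^2 + x + 1) = x^5 + x^4 + 1, so k = 2 and n = 5.
G = gf2_mul(0b1011, 0b111)
```

    Since polynomial reduction over GF(2) is linear, `crc(a ^ b, G)` equals `crc(a, G) ^ crc(b, G)` for any two messages, which is the structural property that the XOR-universal analysis builds on. Real deployments would use far larger n than this toy example.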

  • 14.
    Dubrova, Elena
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Teslenko, Maxim
    An efficient SAT-based algorithm for finding short cycles in cryptographic algorithms. 2018. In: Proceedings of the 2018 IEEE International Symposium on Hardware Oriented Security and Trust, HOST 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 65-72. Conference paper (Refereed)
    Abstract [en]

    The absence of short cycles is a desirable property for cryptographic algorithms that are iterated. Furthermore, as demonstrated by the cryptanalysis of A5, short cycles can be exploited to reduce the complexity of an attack. We present an algorithm which uses SAT-based bounded model checking for finding all short cycles of a given length. The existing Binary Decision Diagram (BDD) based algorithms for finding cycles have limited capacity due to the excessive memory requirements of BDDs. The simulation-based algorithms can be applied to larger problem instances; however, they cannot guarantee the detection of all cycles of a given length. The same holds for general-purpose SAT-based model checkers. The presented algorithm can handle cryptographic algorithms with very large state spaces, including important ciphers such as Trivium and Grain-128. We found that these ciphers contain short cycles whose existence, to the best of our knowledge, was previously unknown. This potentially opens new possibilities for cryptanalysis.

  • 15.
    Ebrahimi, Masoumeh
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    Kelati, Amleset
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    Nkonoki, Emma
    Kondoro, Aron
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems.
    Rwegasira, Diana
    KTH.
    Ben Dhaou, Imed
    Taajamaa, Ville
    Tenhunen, Hannu
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Integrated devices and circuits.
    Creation of CERID: Challenge, Education, Research, Innovation, and Deployment in the context of smart MicroGrid. 2019. In: IST-Africa 2019 Conference Proceedings / [ed] Paul Cunningham ; Miriam Cunningham, 2019. Conference paper (Refereed)
    Abstract [en]

    The iGrid project deals with the design and implementation of a solar-powered smart microgrid to supply electric power to small rural communities. In this paper, we discuss the roadmap of the iGrid project, which is formed by merging the roadmaps of KIC (Knowledge and Innovation Community) and CDE (Challenge-Driven Education). We introduce and explain a five-gear chain of Challenge, Education, Research, Innovation, and Deployment, called CERID, to reach the main goals of this project. We investigate the full chain in the iGrid project, which is established between KTH Royal Institute of Technology (Sweden) and University of Dar es Salaam (Tanzania). We introduce the key stakeholders and explain how CERID goals can be accomplished in higher education and through scientific research. Challenges are discussed, some innovative ideas are introduced and deployment solutions are recommended.

  • 16. Fakih, M.
    et al.
    Grüttner, K.
    Schreiner, S.
    Seyyedi, R.
    Azkarate-Askasua, M.
    Onaindia, P.
    Poggi, T.
    Romero, N. G.
    Gonzalez, E. Q.
    Sundström, T.
    Frasquet, S. P.
    Balbastre, P.
    Mohammadat, T.
    Öberg, Johnny
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Bebawy, Y.
    Obermaisser, R.
    Maleki, A.
    Lenz, A.
    Graham, D.
    Experimental evaluation of SAFEPOWER architecture for safe and power-efficient mixed-criticality systems. 2019. In: Journal of Low Power Electronics and Applications, Vol. 9, no 1, article id 12. Article in journal (Refereed)
    Abstract [en]

    With the ever-increasing industrial demand for bigger, faster and more efficient systems, a growing number of cores is integrated on a single chip. Additionally, their performance is further maximized by simultaneously executing as many processes as possible. Even in safety-critical domains like railway and avionics, multicore processors are introduced, but under strict certification regulations. As the number of cores is continuously expanding, the importance of cost-effectiveness grows. One way to increase the cost-efficiency of such a System on Chip (SoC) is to enhance the way the SoC handles its power consumption. By increasing the power efficiency, the reliability of the SoC is raised because the lifetime of the battery lengthens. Secondly, by having less energy consumed, the emitted heat is reduced in the SoC, which translates into fewer cooling devices. Though energy efficiency has been thoroughly researched, there is no application of those power-saving methods in safety-critical domains yet. The EU project SAFEPOWER (Safe and secure mixed-criticality systems with low power requirements) targets this research gap and aims to introduce certifiable methods to improve the power efficiency of mixed-criticality systems. This article provides an overview of the SAFEPOWER reference architecture for low-power mixed-criticality systems, which is the most important outcome of the project. Furthermore, the application of this reference architecture in novel railway interlocking and flight controller avionic systems was demonstrated, showing the capability to achieve power savings up to 37%, while still guaranteeing time-triggered task execution and time-triggered NoC-based communication. 

  • 17.
    Jiang, Shuyan
    et al.
    University of Electronic Science and Technology of China, Chengdu, China.
    Wu, Qiong
    University of Electronic Science and Technology of China, Chengdu, China.
    Chen, Shuyu
    University of Electronic Science and Technology of China, Chengdu, China.
    Zhan, Junkai
    University of Electronic Science and Technology of China, Chengdu, China.
    Wang, Junshi
    University of Electronic Science and Technology of China, Chengdu, China.
    Ebrahimi, Masoumeh
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Huang, Letian
    University of Electronic Science and Technology of China, Chengdu, China.
    Testing aware dynamic mapping for path-centric network-on-chip test2019In: Integration, ISSN 0167-9260, E-ISSN 1872-7522, Vol. 67, p. 134-143Article in journal (Refereed)
    Abstract [en]

    With the aggressive scaling of submicron technology, intermittent faults are becoming one of the limiting factors in achieving high reliability in Network-on-Chip (NoC). Increasing test frequency is necessary to detect intermittent faults, which in turn interrupts the execution of applications. On the other hand, the primary goal of traditional mapping algorithms is to allocate applications to the NoC platform, ignoring the test requirement. In this paper, we propose a novel testing-aware mapping algorithm (TAMA) for NoC, targeting intermittent faults on the paths between crossbars. In this approach, the idle paths are identified, and the components between two crossbars are tested when the application is mapped to the platform. The components can be tested if there is enough time from the time when the application leaves the platform to the time when a new application enters it. The mapping algorithm is tuned to give a higher priority to the tested paths in the next application mapping, which leaves enough time to test the links and the associated components that have not been tested in the expected time. Experimental results show that the proposed testing-aware mapping algorithm leads to a significant improvement over FF (First Free), NN (Nearest Neighbor), CoNA (Contiguous Neighborhood Allocation), and WeNA (Weighted-based Neighborhood Allocation).

  • 18.
    Kelati, Amleset
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems. KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Integrated devices and circuits. University of Turku, Finland.
    Nigussie, Ethiopia
    University of Turku, Finland.
    Plosila, Juha
    University of Turku, Finland.
    Tenhunen, Hannu
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Integrated devices and circuits. University of Turku, Finland.
    Biosignal Feature Extraction Techniques for IoT Healthcare Platform2016In: IEEE Conference on Design and Architectures for Signal and Image Processing (DASIP2016), Rennes, France, 2016Conference paper (Other (popular science, discussion, etc.))
    Abstract [en]

    In an IoT healthcare platform, a variety of biosignals are acquired from its sensors, and appropriate feature extraction techniques are crucial in order to make use of the acquired biosignal data and help the healthcare scientist or bio-engineer to reach optimal decisions. This work reviews the existing biosignal feature extraction and classification methods for different healthcare applications. Due to the enormous number of different biosignals, and since most healthcare applications use electrocardiogram (ECG), electroencephalogram (EEG), electromyogram (EMG), and electrogastrogram (EGG) signals, we focus the review on feature extraction and classification methods for these biosignals. The review also includes a summary of Blood Oxygen Saturation determined by Pulse Oximetry (SpO2), Electrooculography and eye movement (EOG), and Respiration (RSP) signals. Its discussion and analysis focus on the advantages, performance and drawbacks of the techniques.

  • 19.
    Kelati, Amleset
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems. University of Turku (UTU), Turku, Finland.
    Plosila, Juha
    University of Turku (UTU), Turku, Finland.
    Tenhunen, Hannu
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems. University of Turku (UTU), Turku, Finland.
    Smart Meter Load Profiling for e-Health Monitoring System2019In: 2019 IEEE 7th International Conference on Smart Energy Grid Engineering (SEGE), Oshawa, ON, Canada: IEEE, 2019, p. 6Conference paper (Refereed)
    Abstract [en]

    A structural health-monitoring system is needed to address the problems associated with the rapidly growing elderly population and the resulting health care demand. The paper discusses how consumers' electricity usage data from smart meters can support the healthcare sector by load profiling normal or abnormal energy consumption. For this work, the measured dataset is taken from 12 households and collected by the smart meter at one-hour intervals for one month. The dataset is grouped according to feature patterns, reduced by matrix-based analysis, and classified with the K-Means clustering algorithm. We show how the trend in the Sum Square Error (SSE) of the clustering result indicates normal or abnormal electricity usage behavior and supports assumptions about the consumer's health status.
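
    The clustering step mentioned in this abstract can be illustrated with a minimal sketch of K-Means and its SSE on hourly load profiles. The data values, the function names and the deterministic initialization (the first k profiles serve as starting centroids) are assumptions made for illustration; the abstract does not describe the authors' implementation or how the SSE trend is interpreted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of profiles."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, max_iters=100):
    """Plain K-Means; returns (labels, centroids, sse).

    Assumption: the first k points are used as initial centroids so that
    the sketch is deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster happens to be empty.
        updated = [centroid(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
              for p in points]
    sse = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return labels, centroids, sse


# Hypothetical three-hour load profiles (kWh) for four households.
profiles = [
    [0.2, 0.3, 0.2],
    [0.3, 0.2, 0.3],
    [2.0, 2.2, 2.0],
    [2.2, 2.0, 2.2],
]
labels, centroids, sse = kmeans(profiles, k=2)
print(labels, round(sse, 3))
```

    A lower SSE means the profiles sit closer to their cluster centroids; how a given SSE trend maps to normal or abnormal behavior is left to the paper.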

  • 20.
    Klaus, Tobias
    et al.
    Friedrich Alexander Univ Erlangen Nurnberg FAU, Distributed Syst & Operating Syst, Erlangen, Germany..
    Franzmann, Florian
    Friedrich Alexander Univ Erlangen Nurnberg FAU, Distributed Syst & Operating Syst, Erlangen, Germany..
    Becker, Matthias
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Ulbrich, Peter
    Friedrich Alexander Univ Erlangen Nurnberg FAU, Distributed Syst & Operating Syst, Erlangen, Germany..
    Data Propagation Delay Constraints in Multi-Rate Systems - Deadlines vs. Job-Level Dependencies2018In: PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON REAL-TIME NETWORKS AND SYSTEMS (RTNS 2018), ASSOC COMPUTING MACHINERY , 2018Conference paper (Refereed)
    Abstract [en]

    Many industrial areas are faced with a continuous increase in system complexity, while systems need to satisfy stringent timing requirements, which are traditionally based on the tasks' local deadlines. However, correct functionality is subject to high-level timing requirements on data propagation through a set of semantically related tasks. Since distributed concurrent engineering is often used to deal with the complexity of such systems, violations of data propagation delay constraints are only visible at late development stages, where changes in system design become increasingly expensive. In this paper, we leverage job-level dependencies (JLDs) that can be specified at early development stages to guarantee data propagation delay constraints. Therefore, we present an approach that extends the Real-Time Systems Compiler to enforce the JLDs in actual multicore schedules. This strategy enables us to perform extensive evaluations of the effectiveness of JLDs in combination with contemporary allocation and scheduling algorithms, where we observed schedulability improvements of up to 42%. Additionally, we identified the effect of the number of available cores on the data age.

  • 21.
    Kokhazadeh, M.
    et al.
    KN Toosi Univ Technol, Dept Comp Engn, Tehran 1631714191, Iran..
    Kokhazad, Z.
    Sharif Branch, Acad Ctr Educ Culture & Res, Dept IT, Tehran 141554364, Iran..
    Dehyadegari, M.
    KN Toosi Univ Technol, Dept Comp Engn, Tehran 1631714191, Iran..
    Daneshtalab, Masoud
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    A novel two-step method for stereo vision to reduce search space2018In: 26TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE 2018), IEEE , 2018, p. 1681-1686Conference paper (Refereed)
    Abstract [en]

    Stereo vision is a crucial algorithm in depth detection. By comparing images of a scene from two points, the relative position of objects is extracted. The human visual system uses this relative shift between the left and right eyes to estimate depth information. The main goal of stereo vision is to determine the distance between objects in the scene or, in other words, to obtain depth information. This paper presents a two-step method to reduce the runtime and maintain the accuracy of the stereo vision algorithm. Due to the data dependency, its implementation in parallel reduces performance. We have implemented this method for different values of maximum disparity and window size. The simulation results show that the proposed method is more than 6X faster than the common stereo vision algorithm. We have also implemented this method using Compute Unified Device Architecture (CUDA) on a Graphics Processing Unit (GPU), and we have shown that due to data dependency, this method does not work well on the Graphics Processing Unit.

  • 22.
    Li, Pu
    et al.
    Taiyuan Univ Technol, Minist Educ, Key Lab Adv Transducers & Intelligent Control Sys, Taiyuan 030024, Shanxi, Peoples R China.;Taiyuan Univ Technol, Coll Phys & Optoelect, Inst Optoelect Engn, Taiyuan 030024, Shanxi, Peoples R China.;Bangor Univ, Sch Elect Engn, Bangor LL57 1UT, Gwynedd, Wales.;Inst Southwestern Commun, Sci & Technol Commun Lab, Chengdu 610041, Sichuan, Peoples R China..
    Guo, Ya
    Taiyuan Univ Technol, Minist Educ, Key Lab Adv Transducers & Intelligent Control Sys, Taiyuan 030024, Shanxi, Peoples R China.;Taiyuan Univ Technol, Coll Phys & Optoelect, Inst Optoelect Engn, Taiyuan 030024, Shanxi, Peoples R China..
    Guo, Yanqiang
    Taiyuan Univ Technol, Minist Educ, Key Lab Adv Transducers & Intelligent Control Sys, Taiyuan 030024, Shanxi, Peoples R China.;Taiyuan Univ Technol, Coll Phys & Optoelect, Inst Optoelect Engn, Taiyuan 030024, Shanxi, Peoples R China..
    Fan, Yuanlong
    Bangor Univ, Sch Elect Engn, Bangor LL57 1UT, Gwynedd, Wales..
    Guo, Xiaomin
    Taiyuan Univ Technol, Minist Educ, Key Lab Adv Transducers & Intelligent Control Sys, Taiyuan 030024, Shanxi, Peoples R China.;Taiyuan Univ Technol, Coll Phys & Optoelect, Inst Optoelect Engn, Taiyuan 030024, Shanxi, Peoples R China..
    Liu, Xianglian
    Taiyuan Univ Technol, Minist Educ, Key Lab Adv Transducers & Intelligent Control Sys, Taiyuan 030024, Shanxi, Peoples R China.;Taiyuan Univ Technol, Coll Phys & Optoelect, Inst Optoelect Engn, Taiyuan 030024, Shanxi, Peoples R China..
    Shore, K. Alan
    Bangor Univ, Sch Elect Engn, Bangor LL57 1UT, Gwynedd, Wales..
    Dubrova, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Xu, Bingjie
    Inst Southwestern Commun, Sci & Technol Commun Lab, Chengdu 610041, Sichuan, Peoples R China..
    Wang, Yuncai
    Taiyuan Univ Technol, Minist Educ, Key Lab Adv Transducers & Intelligent Control Sys, Taiyuan 030024, Shanxi, Peoples R China.;Taiyuan Univ Technol, Coll Phys & Optoelect, Inst Optoelect Engn, Taiyuan 030024, Shanxi, Peoples R China..
    Wang, Anbang
    Taiyuan Univ Technol, Minist Educ, Key Lab Adv Transducers & Intelligent Control Sys, Taiyuan 030024, Shanxi, Peoples R China.;Taiyuan Univ Technol, Coll Phys & Optoelect, Inst Optoelect Engn, Taiyuan 030024, Shanxi, Peoples R China..
    Self-balanced real-time photonic scheme for ultrafast random number generation2018In: APL PHOTONICS, ISSN 2378-0967, Vol. 3, no 6, article id 061301Article in journal (Refereed)
    Abstract [en]

    We propose a real-time self-balanced photonic method for extracting ultrafast random numbers from broadband randomness sources. In place of electronic analog-to-digital converters (ADCs), the balanced photo-detection technology is used to directly quantize optically sampled chaotic pulses into a continuous random number stream. Benefitting from ultrafast photo-detection, our method can efficiently eliminate the generation rate bottleneck from electronic ADCs which are required in nearly all the available fast physical random number generators. A proof-of-principle experiment demonstrates that, using our approach, 10 Gb/s real-time and statistically unbiased random numbers are successfully extracted from a bandwidth-enhanced chaotic source. The generation rate achieved experimentally here is limited by the bandwidth of the chaotic source. The method described has the potential to attain a real-time rate of 100 Gb/s.

  • 23.
    Lu, Zhonghai
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Vangal, S.
    Xu, J.
    Bogdan, P.
    Message from the Chairs2018In: 12th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2018; Torino; Italy; 4 October 2018 through 5 October 2018, Institute of Electrical and Electronics Engineers Inc. , 2018, article id 8512149Conference paper (Refereed)
  • 24.
    Lu, Zhonghai
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Yao, Yuan
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Thread Voting DVFS for Manycore NoCs2018In: I.E.E.E. transactions on computers (Print), ISSN 0018-9340, E-ISSN 1557-9956, Vol. 67, no 10, p. 1506-1524, article id 8338086Article in journal (Refereed)
    Abstract [en]

    We present a thread-voting DVFS technique for manycore networks-on-chip (NoCs). This technique has two remarkable features which differentiate it from conventional NoC DVFS schemes. (1) Not only network-level but also thread-level runtime performance indicatives are used to guide DVFS decisions. (2) To resolve multiple, perhaps conflicting, performance indicatives from many cores, it allows each thread to 'vote' for a V/F level in its own performance interest, and a region-based V/F controller makes dynamic per-region V/F decisions according to the major vote. We evaluate our technique on a 64-core CMP in the full-system simulation environment GEM5 with both PARSEC and SPEC OMP2012 benchmarks. Compared to a network metric (router buffer occupancy) based approach, it can improve the network energy efficacy measured in MPPJ (million packets per joule) by up to 22 percent for PARSEC and 20 percent for SPEC OMP2012, and the system energy efficacy measured in MIPJ (million instructions per joule) by up to 35 percent for PARSEC and 33 percent for SPEC OMP2012.
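
    The per-region voting idea in this abstract can be sketched as a simple vote count. This is only an illustration: the abstract does not say how a thread derives its vote or how ties are resolved, so the integer V/F levels, the tie-break toward the higher level and the default for a region without votes are all assumptions.

```python
from collections import Counter


def region_vf_level(votes, default_level=0):
    """Pick the V/F level that received the most votes in one region.

    Assumptions: levels are integers, ties go to the higher level, and a
    region with no votes falls back to default_level.
    """
    if not votes:
        return default_level
    counts = Counter(votes)
    most = max(counts.values())
    return max(level for level, n in counts.items() if n == most)


# Hypothetical votes cast by the threads running in three regions.
votes_by_region = {
    "r0": [2, 2, 1, 3],
    "r1": [0, 1, 1, 0],
    "r2": [],
}
decisions = {region: region_vf_level(v) for region, v in votes_by_region.items()}
print(decisions)
```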

  • 25.
    Lv, Hao
    et al.
    Zhou, You
    Wu, Fei
    Xiao, Weijun
    He, Xubin
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Xie, Changsheng
    Exploiting Minipage-level Mapping to Improve Write Efficiency of NAND Flash2018In: 2018 IEEE INTERNATIONAL CONFERENCE ON NETWORKING, ARCHITECTURE AND STORAGE (NAS), Huazhong Univ Sci & Technol, Shenzhen Res Inst, Shenzhen 51800, Peoples R China. [Lv, Hao; Zhou, You; Wu, Fei; Xie, Changsheng] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Hubei, Peoples R China. [Wu, Fei; Xie, Changsheng] Minist Educ, Key Lab Data Storage Syst, Wuhan 430074, Hubei, Peoples R China. [Xiao, Weijun] Virginia Commonwealth Univ, Dept Elect & Comp Engn, Richmond, VA 23284 USA. [He, Xubin] Temple Univ, Coll Sci & Technol, Philadelphia, PA 19122 USA. [Lu, Zhonghai] KTH Royal Inst Technol, Sch Informat & Commun Technol, S-10044 Stockholm, Sweden.: IEEE , 2018Conference paper (Refereed)
    Abstract [en]

    Pushing NAND flash memory to higher density, manufacturers are aggressively enlarging the flash page size. However, the sizes of I/O requests in a wide range of scenarios do not grow accordingly. Since a page is the unit of flash read/write operations, traditional flash translation layers (FTLs) maintain the page mapping regularity. Hence, small random write requests become common, leading to extensive partial logical page writes. This write inefficiency significantly degrades the performance and increases the write amplification of flash storage. In this paper, we first propose a configurable mapping layer, called minipage, whose size is set to match I/O request sizes. The minipage-level mapping provides better flexibility in handling small writes at the cost of sequential read performance degradation and a larger mapping table. Then, we propose a new FTL, called PM-FTL, that exploits the minipage-level mapping to improve write efficiency and utilizes the page-level mapping to reduce the costs caused by the minipage-level mapping. Finally, trace-driven simulation results show that compared to traditional FTLs, PM-FTL reduces the write amplification and flash storage response time by an average of 33.4% and 19.1%, up to 57.7% and 34%, respectively, under 16KB flash pages and 4KB minipages.
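
    The trade-off named in this abstract (fewer padded bytes for small writes, but a larger mapping table) can be made concrete with simple arithmetic. The model below is a deliberate simplification: it assumes each request is padded up to whole mapping units and ignores garbage collection, so it does not reproduce PM-FTL or the reported results. The 64 GiB capacity and the function names are hypothetical.

```python
import math

KB = 1024


def programmed_bytes(request_bytes, unit_bytes):
    """Bytes programmed when a request is padded up to whole mapping units."""
    return math.ceil(request_bytes / unit_bytes) * unit_bytes


def mapping_entries(capacity_bytes, unit_bytes):
    """Number of mapping-table entries needed to cover the device."""
    return capacity_bytes // unit_bytes


PAGE = 16 * KB      # flash page size from the abstract
MINIPAGE = 4 * KB   # minipage size from the abstract
request = 4 * KB    # a small random write

# Padding overhead per request under this simplified model.
overhead_page = programmed_bytes(request, PAGE) / request
overhead_mini = programmed_bytes(request, MINIPAGE) / request

capacity = 64 * 1024 ** 3  # hypothetical 64 GiB device
entries_page = mapping_entries(capacity, PAGE)
entries_mini = mapping_entries(capacity, MINIPAGE)

print(overhead_page, overhead_mini, entries_page, entries_mini)
```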

  • 26.
    Ma, Ruixiang
    et al.
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Hubei, Peoples R China..
    Wu, Fei
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Hubei, Peoples R China.;Huazhong Univ Sci & Technol, Shenzhen Res Inst, Shenzhen 518000, Peoples R China..
    Zhang, Meng
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Hubei, Peoples R China..
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Wan, Jiguang
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Hubei, Peoples R China..
    Xie, Changsheng
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan 430074, Hubei, Peoples R China.;Huazhong Univ Sci & Technol, Shenzhen Res Inst, Shenzhen 518000, Peoples R China..
    RBER-Aware Lifetime Prediction Scheme for 3D-TLC NAND Flash Memory2019In: IEEE Access, E-ISSN 2169-3536, Vol. 7, p. 44696-44708Article in journal (Refereed)
    Abstract [en]

    NAND flash memory is widely used in various computing systems. However, flash blocks can sustain only a limited number of program/erase (P/E) cycles, which is referred to as the endurance. On the one hand, in order to ensure data integrity, flash manufacturers often define the maximum P/E cycles of the worst block as the endurance of flash blocks. On the other hand, blocks exhibit large endurance variations, which introduce two serious problems. The first problem is that the error correcting code (ECC) is often over-provisioned, as it has to be designed to tolerate the worst case to ensure data integrity, which causes longer decoding latency. The second problem is that the block's lifespan is underutilized due to the conservatively defined block endurance. The raw bit error rate (RBER) of most blocks has not reached the allowable RBER at the nominal endurance point, which implies that conventional P/E cycle-based block retirement policies may waste a large amount of flash storage space. In this paper, to exploit the storage capacity of each flash block, we propose an RBER-aware lifetime prediction scheme based on machine learning technologies. We consider the problem that the model can lose prediction effectiveness over time and use incremental learning to update the model so that it adapts to the changes at different lifetime stages. At run time, trained data will be gradually discarded, which can reduce memory overhead. To evaluate the scheme, four well-known machine learning techniques have been compared in terms of predictive accuracy and time overhead under our proposed lifetime prediction scheme. We also compared the predicted values with the tested values obtained in the real NAND flash-based test platform, and the experimental results show that the support vector machine (SVM) models based on our proposed lifetime prediction scheme can achieve as high as 95% accuracy for flash blocks. We also apply our proposed lifetime prediction scheme to predict the actual endurance of flash blocks at four different retention times, and the experimental results show that it can significantly improve the maximum P/E cycle of flash blocks from 37.5% to 86.3% on average. Therefore, the proposed lifetime prediction scheme can provide a guide for block endurance prediction.

  • 27.
    Marranghello, Felipe
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Callegaro, V.
    Reis, A. I.
    Ribas, R. P.
    Four-level forms for memristive material implication logic2019In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 27, no 5, p. 1228-1232, article id 8621037Article in journal (Refereed)
    Abstract [en]

    This brief proposes the use of four-level forms in the memristive material implication (M-IMP) logic. M-IMP is a promising approach to perform stateful logic in memristive nonvolatile memories. In such a design technique, a given Boolean function is evaluated as a sequence of instructions, making logic synthesis methods necessary to attain the shortest sequence. In comparison to previous work, experimental results have shown an average reduction of 40% when evaluating the tradeoff between the numbers of instructions and memristive devices.

  • 28.
    Qin, Zidi
    et al.
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China..
    Zhu, Di
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China..
    Zhu, Xingwei
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China..
    Chen, Xuan
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China..
    Shi, Yinghuan
    Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China..
    Gao, Yang
    Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Jiangsu, Peoples R China..
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Shen, Qinghong
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China..
    Li, Li
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China..
    Pan, Hongbing
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210023, Jiangsu, Peoples R China..
    Accelerating Deep Neural Networks by Combining Block-Circulant Matrices and Low-Precision Weights2019In: ELECTRONICS, ISSN 2079-9292, Vol. 8, no 1, article id 78Article in journal (Refereed)
    Abstract [en]

    As a key ingredient of deep neural networks (DNNs), fully-connected (FC) layers are widely used in various artificial intelligence applications. However, there are many parameters in FC layers, so the efficient processing of FC layers is restricted by memory bandwidth. In this paper, we propose a compression approach combining block-circulant matrix-based weight representation and power-of-two quantization. Applying block-circulant matrices in FC layers can reduce the storage complexity from O(k^2) to O(k). By quantizing the weights into integer powers of two, the multiplications during inference can be replaced by shift and add operations. The memory usage of models for MNIST, CIFAR-10 and ImageNet can be compressed by 171x, 2731x and 128x, respectively, with minimal accuracy loss. A configurable parallel hardware architecture is then proposed for processing the compressed FC layers efficiently. Without multipliers, a block matrix-vector multiplication module (B-MV) is used as the computing kernel. The architecture is flexible enough to support FC layers of various compression ratios with a small footprint. At the same time, the memory access can be significantly reduced by using the configurable architecture. Measurement results show that the accelerator has a processing power of 409.6 GOPS, and achieves 5.3 TOPS/W energy efficiency at 800 MHz.
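
    The two ideas combined in this abstract can be sketched in a few lines: a k-by-k circulant block is fully defined by k stored weights, and a weight rounded to a signed power of two can be applied to an integer activation with a shift. The function names, the rounding rule (nearest exponent in the log domain) and the integer activations are assumptions for illustration and are not taken from the paper.

```python
import math


def circulant_matvec(first_col, x):
    """Compute y = C x for the circulant block C[i][j] = first_col[(i - j) % k].

    Only the k entries of first_col are stored instead of k * k weights.
    """
    k = len(first_col)
    return [sum(first_col[(i - j) % k] * x[j] for j in range(k))
            for i in range(k)]


def quantize_pow2(w):
    """Round a weight to sign * 2**e with integer e; returns (sign, e)."""
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    return sign, round(math.log2(abs(w)))


def shift_mul(x_int, sign, e):
    """Multiply an integer activation by sign * 2**e using shifts only.

    Note: right shifts of negative integers round toward minus infinity.
    """
    if sign == 0:
        return 0
    shifted = x_int << e if e >= 0 else x_int >> -e
    return sign * shifted


# One 3x3 circulant block stored as 3 weights instead of 9.
y = circulant_matvec([1, 2, 3], [1, 0, 0])

q_small = quantize_pow2(0.3)    # nearest power of two is 2**-2
q_neg = quantize_pow2(-3.0)     # nearest power of two is -(2**2)
prod_small = shift_mul(64, *q_small)
prod_neg = shift_mul(5, *q_neg)
print(y, q_small, q_neg, prod_small, prod_neg)
```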

  • 29.
    Reinhold, Ingo
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems. XaarJet ltd..
    Industrial Digital Fabrication Using Inkjet Technology2019Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    The use of acoustic waves initiated by the deformation of a microchannel is one method for generating monodisperse, micrometer-sized droplets from small orifices and is employed in piezo-electric inkjet printheads. These printheads are used in both graphical printing and digital fabrication, where functionalities, such as optical, biological, electrical or mechanical, are being produced locally. The processes leading to detrimental artifacts such as satellite droplets or nozzle outages, however, are not fully understood and require profound experimentation. This thesis presents both novel techniques to study jetting for optimal droplet formation and reliability, as well as the post-processing techniques required for solution-based production of a conductive feature on low-cost polymeric substrates.

    A multi-exposure imaging system using laser light pulses shorter than 50 ns and a MEMS micro-mirror enabled the imaging of the droplet formation at ten instances on the droplet's travel towards the substrate. The technique allows for the study of droplet formation, satellite droplet break-up and secondary tail formation, allowing for better control and understanding of the process.

    Reliability measurement using a linescan camera was introduced to record every droplet ejected from the width of a printhead. The variations in droplet velocity and misalignment of the printhead required the use of a constant background illumination to reliably capture the droplets. The resulting low-contrast images were post-processed using statistical analysis of the graylevel distributions of both the droplet and background pixels, and were subsequently used in a histogram matching algorithm to enable reliable identification of the threshold value required for unhindered detection of missing droplets based on the printed image. Using temporal oversampling, the technique was shown to qualitatively describe droplet velocity variations introduced by the actuation of the printhead.

    The conversion of inkjet-printed metallic nanoparticle inks to conductive structures was investigated with a focus on the applicability to industrial processes. Intense pulsed light (IPL) processing achieved comparable results to convective oven sintering in less than ten seconds. The dynamics of IPL sintering were found to be strongly dependent on the spectral composition of the light resonating in the processing chamber. By implementing a passive filtering concept, thermal runaway was prevented and the line conformation was optimized irrespective of the underlying substrate. Alternatively, pulse-shaping, to tailor the energy flux into the deposit and incorporate drying in the IPL process, was found to generate conductive copper features without pre-drying.

    The findings were applied to applications comprising small droplet generation for nanoimprint lithography, the fabrication of conductors for blind via connections to buried LED dies as well as the hybrid generation of hyperbolic ion-trap electrodes for mass spectrometry applications. The addition of the non-contact and high accuracy of the inkjet process enabled suitable performance that lies beyond that of conventional processes.

  • 30.
    Reinhold, Ingo
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems. XaarJet ltd..
    Černý, Tomáš
    Xaar plc.
    Quantitative Assessment of Inkjet Reliability under Industrial Conditions: Measuring All Drops during Extended High‐Duty Printing2018In: Handbook of Industrial Inkjet Printing / [ed] Werner Zapka, Weinheim: Wiley-VCH Verlagsgesellschaft, 2018, p. 445-458Chapter in book (Refereed)
    Abstract [en]

    Reliability is one of the key challenges in inkjet technology. Nozzles perform unreliably for a number of reasons, such as drying, clogging through air‐ and inkborne contaminants, ingestion of air, or nozzle plate flooding. To extract quantitative information about the number of missing droplets from the acquired images, suitable algorithms need to be applied. To identify the presence of the droplet, a value derived from the characteristics of the area of interest needs to be compared with the threshold value. The Line Scan approach for the measurement of reliability offers a convenient way to assess reliability of a printhead in a laboratory environment providing quantitative and statistical data about location, duration, and time of misfire events. With the knowledge of the printhead frequency and the hypothetical print resolution applied in the printing experiments, the length of such tic marks as a number of subsequent missing droplets can be calculated.
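
    The last sentence of this abstract describes a conversion that can be written out as plain arithmetic. The sketch below assumes the tick mark length is measured in millimetres along the print direction and the print resolution is given in dots per inch; the function names and example numbers are hypothetical and are not taken from the chapter.

```python
MM_PER_INCH = 25.4


def missing_droplets(tick_length_mm, resolution_dpi):
    """Number of consecutive missing droplets represented by a tick mark."""
    return round(tick_length_mm / MM_PER_INCH * resolution_dpi)


def misfire_duration_s(n_droplets, frequency_hz):
    """Duration of the misfire event at the given printhead frequency."""
    return n_droplets / frequency_hz


# Hypothetical example: a 25.4 mm tick mark at 360 dpi and 6 kHz.
n_missing = missing_droplets(25.4, 360)
duration = misfire_duration_s(n_missing, 6000)
print(n_missing, duration)
```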

  • 31.
    Rosvall, Kathrin
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Mohammadat, Tage
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics.
    Ungureanu, George
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics.
    Öberg, Johnny
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Sander, Ingo
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics.
    Exploring Power and Throughput for Dataflow Applications on Predictable NoC Multiprocessors2018Conference paper (Refereed)
    Abstract [en]

    System level optimization for multiple mixed-criticality applications on shared networked multiprocessor platforms is extremely challenging. Substantial complexity arises from the interdependence between the multiple subproblems of mapping, scheduling and platform configuration under the consideration of several, potentially orthogonal, performance metrics and constraints. Instead of using heuristic algorithms and problem decomposition, novel unified design space exploration (DSE) approaches based on Constraint Programming (CP) have in recent years shown promising results. The work in this paper takes advantage of the modularity of CP models in order to support heterogeneous multiprocessor Network-on-Chip (NoC) with Temporally Disjoint Networks (TDNs) aware message injection. The DSE supports a range of design criteria, in particular the optimization and satisfaction of power and throughput. In addition, the DSE now provides a valid configuration for the TDNs that guarantees the performance required to fulfil the design goals. The experiments show the capability of the approach to find low-power and high-throughput designs, and validate a resulting design on a physical TDN-based NoC implementation.

  • 32.
    Rwegasira, Diana
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics. Univ Dar Es Salaam, Dar Es Salaam, Tanzania..
    Ben Dhaou, Imed
    Qassim Univ, Coll Engn, Buraydah, Saudi Arabia.;Univ Monastir, Monastir, Tunisia..
    Kondoro, Aron
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics. Univ Dar Es Salaam, Dar Es Salaam, Tanzania..
    Kelati, Amleset
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems. KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Integrated devices and circuits. Univ Turku, Turku, Finland..
    Mvungi, Nerey
    Univ Dar Es Salaam, Dar Es Salaam, Tanzania..
    Tenhunen, Hannu
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Integrated devices and circuits. Univ Turku, Turku, Finland..
    A Hardware-in-Loop Simulation of DC Microgrid using Multi-Agent Systems2018In: PROCEEDINGS OF THE 2018 22ND CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT) / [ed] Balandin, S, IEEE , 2018, p. 232-237Conference paper (Refereed)
    Abstract [en]

    The smart grid is a complex system that incorporates distributed control, communication, optimization, and management functions in addition to legacy functions such as generation, storage, and control. The design and test of new smart-grid algorithms require an efficient simulator. Agent-based simulation platforms are the most popular tools and work well for the control and monitoring functionalities of electric power networks such as the microgrid. Most existing simulation tools necessitate either simulated or static data. In this paper, we propose a hardware-in-loop simulator for a dc-microgrid. The simulator reads the power generated by the PV panels and the battery SoC using a Raspberry Pi. A physical agent that runs on the Raspberry Pi sends the real-time data to a dc-microgrid simulator that runs on a PC. As a proof of concept, we implemented a load-shedding algorithm using the proposed system.

  • 33.
    Shi, Xin
    et al.
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan, Hubei, Peoples R China..
    Wu, Fei
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan, Hubei, Peoples R China..
    Wang, Shunzhuo
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan, Hubei, Peoples R China..
    Xie, Changsheng
    Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Wuhan, Hubei, Peoples R China..
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Program Error Rate-based Wear Leveling for NAND Flash Memory2018In: PROCEEDINGS OF THE 2018 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 1241-1246Conference paper (Refereed)
    Abstract [en]

    Wear leveling has become a fundamental issue in the design of Solid State Disks (SSDs) based on NAND flash memory. Existing schemes aim to equalize the number of program/erase (P/E) cycles and memory raw bit error rates (BER) among all the flash blocks. However, due to fabrication process variation, different blocks of the same flash chip usually have largely different endurance in terms of BER and program error rate (PER). Such conventional designs cannot obtain the wear status of flash blocks precisely. This paper proposes PER-WL, an efficient PER-based wear leveling scheme that uses PER statistics as the measurement of flash block wear-out pace, and performs block data swapping to improve the wear leveling efficiency. In our evaluation with four realistic workloads, the PER-based wear leveling scheme achieves, on average, 17% and 9% reduction in the variance of program error rate and 8% and 3% reduction in program error rate, with 5% and 2% system performance degradation, when compared to two state-of-the-art wear leveling schemes.

  • 34.
    Stathis, Dimitrios
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    Yang, Yu
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    Tewari, Saurabh
    IIT Delhi, India.
    Hemani, Ahmed
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronic and embedded systems.
    Paul, Kolin
    IIT Delhi, India.
    Grabherr, Manfred
    Uppsala University, Sweden.
    Ahmad, Rafi
    Inland University of Norway.
    Approximate Computing Applied to Bacterial Genome Identification using Self-Organizing Maps2019In: 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), IEEE Computer Society, 2019, p. 560-567, article id 8839522Conference paper (Refereed)
    Abstract [en]

    In this paper we explore the design space of a self-organizing map (SOM) used for rapid and accurate identification of bacterial genomes. This is an important health care problem because even in Europe, 70% of prescriptions for antibiotics are wrong. The SOM is trained on Next Generation Sequencing (NGS) data and is able to identify the exact strain of bacteria. This is in contrast to conventional methods that require genome assembly to identify the bacterial strain. The SOM has been implemented as a synchoros VLSI design and shown to have 3-4 orders better computational efficiency compared to GPUs. To further lower the energy consumption, we exploit the robustness of the SOM by successively lowering the resolution to gain further improvements in efficiency and lower the implementation cost without substantially sacrificing the accuracy. We do an in-depth analysis of the reduction in resolution vs. loss in accuracy as the basis for designing a system with the lowest cost and acceptable accuracy, using NGS data from samples containing multiple bacteria from the labs of one of the co-authors. The objective of this method is to design a bacterial recognition system for battery-operated clinical use where the area, power and performance are of critical importance. We demonstrate that accepting a 39% loss in accuracy in a 12-bit representation and 1% in a 16-bit representation can yield significant savings in energy and area.

  • 35.
    Törngren, Martin
    et al.
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Zhang, Xinhai
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Embedded Control Systems.
    Mohan, Naveen
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Becker, Matthias
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Svensson, Lars
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Tao, Xin
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Embedded Control Systems.
    Chen, DeJiu
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Machine Design (Div.). KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Embedded Control Systems. KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Westman, Jonas
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics. KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Embedded Control Systems.
    Architecting Safety Supervisors for High Levels of Automated Driving2018In: Proceeding of the 21st IEEE Int. Conf. on Intelligent Transportation Systems, IEEE, 2018Conference paper (Refereed)
    Abstract [en]

    The complexity of automated driving poses challenges for providing safety assurance. Focusing on the architecting of an Autonomous Driving Intelligence (ADI), i.e. the computational intelligence, sensors and communication needed for high levels of automated driving, we investigate so-called safety supervisors that complement the nominal functionality. We present a problem formulation and a functional architecture of a fault-tolerant ADI that encompasses a nominal and a safety supervisor channel. We then discuss the sources of hazardous events, the division of responsibilities among the channels, and when the supervisor should take over. We conclude with identified directions for further work.

  • 36.
    Wang, Jian
    et al.
    Univ Elect Sci & Technol China, Chengdu 611731, Sichuan, Peoples R China..
    Guo, Shize
    Univ Elect Sci & Technol China, Chengdu 611731, Sichuan, Peoples R China..
    Chen, Zhe
    Univ Elect Sci & Technol China, Chengdu 611731, Sichuan, Peoples R China..
    Li, Yubai
    Univ Elect Sci & Technol China, Chengdu 611731, Sichuan, Peoples R China..
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    A New Parallel CODEC Technique for CDMA NoCs2018In: IEEE transactions on industrial electronics (1982. Print), ISSN 0278-0046, E-ISSN 1557-9948, Vol. 65, no 8, p. 6527-6537Article in journal (Refereed)
    Abstract [en]

    Code division multiple access (CDMA) network-on-chip (NoC) has been proposed for many-core systems due to its data transfer parallelism over communication channels. Consequently, coder-decoder (CODEC) module, which greatly impacts the performance of CDMA NoCs, attracted growing attention in recent years. In this paper, we propose a new parallel CODEC technique for CDMA NoCs. In general, by using a few simple logic circuits with small penalties in area and power, our new parallel (NPC) CODEC can execute the encoding/decoding process in parallel and thus reduce the data transfer latency. To reveal the benefits of our method for on-chip communication, we apply our NPC to CDMA NoCs and perform extensive experiments. From the results, we can find that our method outperforms existing parallel CODECs, such as Walsh-based parallel CODEC (WPC) and overloaded parallel CODEC (OPC). Specifically, it improves the critical point of communication latency (7.3% over WPC and 13.5% over OPC), reduces packet latency jitter by about 17.3% (against WPC) and 71.6% (against OPC), and improves energy efficiency by up to 41.2% (against WPC) and 59.2% (against OPC).

  • 37. Wang, S.
    et al.
    Wu, F.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Zhou, J.
    Xie, C.
    WARD: Wear aware RAID design within SSDs2018In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 37, no 11, p. 2918-2928, article id 8493504Article in journal (Refereed)
    Abstract [en]

    Redundant array of independent disks (RAID) is an efficient approach to relieve the reliability sacrifice caused by aggressive scale-out of solid state drives (SSDs). Unfortunately, RAID is unfriendly to SSDs due to redundant parity writes and data rebuilding. This paper proposes a wear aware RAID design for SSDs, called WARD, which: 1) adaptively organizes RAID stripes according to real-time interblock unbalanced wear for relieving high performance and storage overhead caused by parity data and 2) migrates blocks about to break in advance and leaves these blocks unused to reduce data rebuilding overhead. An efficient block wear detection scheme is employed to detect block wear during the whole lifetime of SSDs. Beginning with a large stripe width RAID instead of the redundant worst-case RAID, WARD reorganizes RAID stripes once wear blocks with high bit error rates come out. WARD divides the original stripe into several short width RAID stripes according to the number of wear blocks and separates all wear blocks into different stripes. This not only reduces parity redundancy but also provides high reliability to avoid more than RAID recoverable error-prone chunks remaining in one stripe. For high wear blocks tending to wear-out, data in them are migrated in advance and then the blocks are left unused, which efficiently avoids performance shock caused by data rebuilding. A reliability model considering interblock unbalanced wear is proposed and reveals that WARD provides a high and stable reliability and greatly prolongs the lifetime of SSDs. Comprehensive experiments based on an SSDsim derivative simulator are carried out and experiment results show that WARD considerably improves system performance compared to the worst-case RAID.

  • 38.
    Wang, Zicong
    et al.
    Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China..
    Chen, Xiaowen
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems. Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Guo, Yang
    Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China..
    Cache Access Fairness in 3D Mesh-Based NUCA2018In: IEEE Access, E-ISSN 2169-3536, Vol. 6, p. 42984-42996Article in journal (Refereed)
    Abstract [en]

    Given the increase in cache capacity over the past few decades, cache access efficiency has come to play a critical role in determining system performance. To ensure efficient utilization of the cache resources, non-uniform cache architecture (NUCA) has been proposed to allow for a large capacity and a short access latency. With the support of networks-on-chip (NoC), NUCA is often employed to organize the last level cache. However, this method also hurts cache access fairness, which denotes the degree of non-uniformity for cache access latencies. This drop in fairness can result in an increased number of cache accesses with overhigh latency, which leads to a bottleneck in system performance. This paper investigates the cache access fairness in the context of NoC-based 3-D chip architecture, and provides new insights into 3-D architecture design. We propose fair-NUCA (F-NUCA), a co-design scheme intended to optimize cache access fairness. In F-NUCA, we strive to improve fairness by equalizing cache access latencies. To achieve this goal, the memory mapping and the channel width are both redistributed non-uniformly, thereby equalizing the non-contention and contention latencies, respectively. The experimental results reveal that F-NUCA can effectively improve cache access fairness. When F-NUCA is compared with the traditional static NUCA in a simulation with PARSEC benchmarks, the average reductions in average latency and latency standard deviation are 4.64%/9.38% for a 4 x 4 x 2 mesh network, as well as 6.31%/13.51% for a 4 x 4 x 4 mesh network. In addition, a 4.0%/6.4% improvement in system throughput can be achieved for the two scales of mesh networks, respectively.

  • 39.
    Yu, Yang
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Dubrova, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Näslund, Mats
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Tao, Sha
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems. Royal Inst Technol, Sch EECS, S-16440 Stockholm, Sweden..
    On Designing PUF-Based TRNGs with Known Answer Tests2018In: 2018 IEEE Nordic Circuits and Systems Conference, NORCAS 2018: NORCHIP and International Symposium of System-on-Chip, SoC 2018 - Proceedings / [ed] Nurmi, J Ellervee, P Mihhailov, J Jenihhin, M Tammemae, K, Institute of Electrical and Electronics Engineers (IEEE), 2018, article id 8573489Conference paper (Refereed)
    Abstract [en]

    Random numbers are widely used in cryptographic algorithms and protocols. A faulty true random number generator (TRNG) may open a door into a system in spite of cryptographic protection. It is therefore important to design TRNGs so that they can be tested at different stages of their lifetime to assure their trustworthiness. In this paper, we propose a method for designing physical unclonable function (PUF)-based TRNGs which can be tested in-field by known answer tests. We present a prototype FPGA implementation of the proposed TRNG based on an arbiter PUF which passes all NIST 800-22 statistical tests and has the minimal entropy of 0.918 estimated according to NIST 800-90B recommendations. This is a nontrivial achievement given that arbiter PUFs are notoriously hard to place in a symmetric manner in FPGAs.

  • 40.
    Yu, Yang
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Teijeira, Victor Diges
    KTH.
    Marranghello, Felipe
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    Dubrova, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Electronics, Electronic and embedded systems.
    One-sided countermeasures for side-channel attacks can backfire2018In: WiSec 2018 - Proceedings of the 11th ACM Conference on Security and Privacy in Wireless and Mobile Networks, Association for Computing Machinery, Inc , 2018, p. 299-301Conference paper (Refereed)
    Abstract [en]

    Side-channel attacks are currently one of the most powerful attacks against implementations of cryptographic algorithms. They exploit the correlation between the physical measurements (power consumption, electromagnetic emissions, timing) taken at different points during the computation and the secret key. Some of the existing countermeasures offer protection against one specific type of side channel only. We show that this can be a bad practice that makes exploitation of other side channels easier. First, we perform a power analysis attack on an FPGA implementation of the Advanced Encryption Standard (AES) which is not protected against side-channel attacks and estimate the number of power traces required to extract its secret key. Then, we repeat the attack on AES implementations which are protected against fault injections by hardware redundancy and show that they can be broken with three times fewer power traces than the unprotected AES. We also demonstrate that the problem cannot be solved by complementing the duplicated module, as previously proposed. Our results show that there is a need for increasing knowledge about side-channel attacks and designing stronger countermeasures.
