kth.se Publications
  • 1.
    Abdollahi, Meisam
    et al.
    Iran Univ Sci & Technol, Tehran, Iran..
    Baharloo, Mohammad
    Inst Res Fundamental Sci IPM, Tehran, Iran..
    Shokouhinia, Fateme
    Amirkabir Univ Technol, Tehran, Iran..
    Ebrahimi, Masoumeh
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    RAP-NoC: Reliability Assessment of Photonic Network-on-Chips, A simulator. 2021. In: Proceedings of the 8th ACM international conference on nanoscale computing and communication (ACM NANOCOM 2021), Association for Computing Machinery (ACM), 2021. Conference paper (Refereed)
    Abstract [en]

    Optical network-on-chip is now accepted as a promising alternative to traditional electrical interconnects owing to its lower transmission delay and power consumption and its considerably higher data bandwidth. However, silicon photonics faces particular challenges that threaten the reliability of data transmission. The most important of these are temperature fluctuation, process variation, aging, crosstalk noise, and insertion loss. Although several attempts have been made to investigate the effect of these issues on the reliability of optical network-on-chip, none of them modeled the reliability of photonic network-on-chip with a system-level approach based on basic element failure rates. In this paper, an analytical model-based simulator, called Reliability Assessment of Photonic Network-on-Chips (RAP-NoC), is proposed to evaluate the reliability of different 2D optical network-on-chip architectures and data traffic. The experimental results show that, in general, the Mesh topology is more reliable than Torus at the same size. Increasing the reliability of the Microring Resonator (MR) has a more significant impact on the reliability of an optical router than on that of a network.

  • 2. Agirre, J. A.
    et al.
    Etxeberria, L.
    Barbosa, R.
    Basagiannis, S.
    Giantamidis, G.
    Bauer, T.
    Ferrari, E.
    Labayen Esnaola, M.
    Orani, V.
    Öberg, Johnny
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Pereira, D.
    Proença, J.
    Schlick, R.
    Smrčka, A.
    Tiberti, W.
    Tonetta, S.
    Bozzano, M.
    Yazici, A.
    Sangchoolie, B.
    The VALU3S ECSEL project: Verification and validation of automated systems safety and security. 2021. In: Microprocessors and microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 87, p. 104349-, article id 104349. Article in journal (Refereed)
    Abstract [en]

    Manufacturers of automated systems and their components have been allocating an enormous amount of time and effort in R&D activities, which led to the availability of prototypes demonstrating new capabilities as well as the introduction of such systems to the market within different domains. Manufacturers need to make sure that the systems function in the intended way and according to specifications. This is not a trivial task as system complexity rises dramatically the more integrated and interconnected these systems become with the addition of automated functionality and features to them. This effort translates into an overhead on the V&V (verification and validation) process making it time-consuming and costly. In this paper, we present VALU3S, an ECSEL JU (joint undertaking) project that aims to evaluate the state-of-the-art V&V methods and tools, and design a multi-domain framework to create a clear structure around the components and elements needed to conduct the V&V process. The main expected benefit of the framework is to reduce time and cost needed to verify and validate automated systems with respect to safety, cyber-security, and privacy requirements. This is done through identification and classification of evaluation methods, tools, environments and concepts for V&V of automated systems with respect to the mentioned requirements. VALU3S will provide guidelines to the V&V community including engineers and researchers on how the V&V of automated systems could be improved considering the cost, time and effort of conducting V&V processes. To this end, VALU3S brings together a consortium with partners from 10 different countries, amounting to a mix of 25 industrial partners, 6 leading research institutes, and 10 universities to reach the project goal.

  • 3.
    Aknesil, Can
    et al.
    KTH.
    Dubrova, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    An FPGA Implementation of 4x4 Arbiter PUF. 2021. In: 2021 IEEE 51st international symposium on multiple-valued logic (ISMVL 2021), Institute of Electrical and Electronics Engineers (IEEE), 2021, p. 160-165. Conference paper (Refereed)
    Abstract [en]

    The need to protect data and bitstreams is increasing in computation environments such as FPGA as a Service (FaaS). Physically Unclonable Functions (PUFs) have been proposed as a solution to this problem. In this paper, we present an implementation of an Arbiter PUF with 4 x 4 switch blocks in a Xilinx Series 7 FPGA, perform its statistical analysis, and compare it to other Arbiter PUF variants. We show that the presented implementation utilizes five times less area than 2 x 2 Arbiter PUF-based implementations. It is suitable for many real-world applications, including identification, authentication, key provisioning, and random number generation.

  • 4.
    Aknesil, Can
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Dubrova, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Towards Generic Power/EM Side-Channel Attacks: Memory Leakage on General-Purpose Computers. 2022. In: Proceedings of the 2022 IFIP/IEEE 30th international conference on very large scale integration (VLSI-SOC), Institute of Electrical and Electronics Engineers (IEEE), 2022. Conference paper (Refereed)
    Abstract [en]

    Today's power/EM side-channel analysis is limited by the complexity of the target hardware. We investigate the feasibility of power/EM side-channel analysis of general-purpose computers. This paper makes a step towards this goal by analyzing memory operations of Raspberry Pi 3 Model B, a widely used general-purpose IoT device that is capable of running an operating system, and shows that it is possible to extract information about the data field of memory operations from near-field EM measurements.

  • 5.
    Altayo Gonzalez, u1dr0yqp
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Stathis, Dimitrios
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems.
    Hemani, Ahmed
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Synthesis of Predictable Global NoC by Abutment in Synchoros VLSI Design. 2021. In: Proceedings - 2021 15th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2021, Association for Computing Machinery (ACM), 2021, p. 61-66. Conference paper (Refereed)
    Abstract [en]

    The synchoros VLSI design style has been proposed as an alternative to the standard cell-based design style; the word synchoros is derived from the Greek word choros for space. Synchoricity discretises space with a virtual grid, the way synchronicity discretises time with clock ticks. SiLago (Silicon Lego) blocks are atomic synchoros building blocks, like Lego bricks. SiLago blocks absorb all metal layer details, i.e., all wires, to enable composition by abutment of valid designs; valid in the sense of being technology design rules compliant, timing clean and OCV ruggedized. Effectively, composition by abutment eliminates logic and physical synthesis for the end user. Like the Lego system, synchoricity needs a finite number of SiLago block types to cater to different types of designs. Global NoCs are important system-level design components. In this paper, we show how, with a small library of SiLago blocks for global NoCs, it is possible to automatically synthesize arbitrary global NoCs of different types, dimensions, and topology. The synthesized global NoCs are not only valid VLSI designs, but their cost metrics (area, latency, and energy) are known with post-layout accuracy in linear time. We argue that this is essential to be able to do chip-level design space exploration. We show how the abstract timing model of such global NoC SiLago blocks can be built and used to analyse the timing of global NoC links with post-layout accuracy and in linear time. We validate this claim by subjecting the same VLSI designs of global NoC to a commercial EDA tool's static timing analysis and show that the abstract timing analysis enabled by synchoros VLSI design gives the same results as the commercial EDA tools.

  • 6.
    Amagat, Jordi
    et al.
    Department of Biological and Chemical Engineering, Aarhus University, Denmark; Sino-Danish College (SDC), University of Chinese Academy of Sciences, Beijing 101400, China.
    Müller, Christoph Alexander
    Department of Biological and Chemical Engineering, Aarhus University, Denmark.
    Jensen, Bjarke Nørrehvedde
    Department of Biological and Chemical Engineering, Aarhus University, Denmark.
    Xiong, Xuya
    Interdisciplinary Nanoscience Center, iNANO, Aarhus University, Denmark.
    Su, Yingchun
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems. Department of Biological and Chemical Engineering, Aarhus University, Denmark.
    Christensen, Natasja Porskjær
    Department of Biological and Chemical Engineering, Aarhus University, Denmark.
    Le Friec, Alice
    Department of Biological and Chemical Engineering, Aarhus University, Denmark.
    Dong, Mingdong
    Interdisciplinary Nanoscience Center, iNANO, Aarhus University, Denmark.
    Fang, Ying
    CAS Center for Excellence in Nanoscience, National Center for Nanoscience and Technology, Beijing 100190, People's Republic of China; CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Neuroscience, Chinese Academy of Sciences, Shanghai 200031, People's Republic of China.
    Chen, Menglin
    Department of Biological and Chemical Engineering, Aarhus University, Denmark; Interdisciplinary Nanoscience Center, iNANO, Aarhus University, Denmark.
    Injectable 2D flexible hydrogel sheets for optoelectrical/biochemical dual stimulation of neurons. 2023. In: Biomaterials Advances, E-ISSN 2772-9508, Vol. 146, article id 213284. Article in journal (Refereed)
    Abstract [en]

    Major challenges in developing implanted neural stimulation devices are the invasiveness, complexity, and cost of the implantation procedure. Here, we report an injectable, nanofibrous 2D flexible hydrogel sheet-based neural stimulation device that can be non-invasively implanted via syringe injection for optoelectrical and biochemical dual stimulation of neurons. Specifically, methacrylated gelatin (GelMA)/alginate hydrogel nanofibers were mechanically reinforced with a poly(lactide-co-ε-caprolactone) (PLCL) core by coaxial electrospinning. The lubricant hydrogel shell enabled not only injectability, but also facile incorporation of functional nanomaterials and bioactives. The nanofibers loaded with photocatalytic g-C3N4/GO nanoparticles were capable of stimulating neural cells via blue light, with a significant 36.3% enhancement in neurite extension. Meanwhile, the nerve growth factor (NGF) loaded nanofibers supported a sustained release of NGF with well-maintained function to biochemically stimulate neural differentiation. We have demonstrated the capability of an injectable, hydrogel nanofibrous, neural stimulation system to support neural stimulation both optoelectrically and biochemically, which represents crucial early steps in a larger effort to create a minimally invasive system for neural stimulation.

  • 7.
    Amagat, Jordi
    et al.
    Aarhus Univ, Dept Biol & Chem Engn, Aarhus, Denmark.;Univ Chinese Acad Sci, Sinodanish Coll SDC, Beijing 101400, Peoples R China..
    Su, Yingchun
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems. Aarhus Univ, Dept Biol & Chem Engn, Aarhus, Denmark..
    Svejso, Frederik Hobjerg
    Aarhus Univ, Dept Biol & Chem Engn, Aarhus, Denmark..
    Le Friec, Alice
    Aarhus Univ, Dept Biol & Chem Engn, Aarhus, Denmark..
    Sonderskov, Steffan Moller
    Aarhus Univ, INANO, Interdisciplinary Nanosci Ctr, Aarhus, Denmark..
    Dong, Mingdong
    Aarhus Univ, INANO, Interdisciplinary Nanosci Ctr, Aarhus, Denmark..
    Fang, Ying
    Chinese Acad Sci, Ctr Excellence Nanosci, Natl Ctr Nanosci & Technol, Beijing 100190, Peoples R China.;Chinese Acad Sci, Inst Neurosci, CAS Ctr Excellence Brain Sci & Intelligence Techno, Shanghai 200031, Peoples R China..
    Chen, Menglin
    Aarhus Univ, Dept Biol & Chem Engn, Aarhus, Denmark.;Aarhus Univ, INANO, Interdisciplinary Nanosci Ctr, Aarhus, Denmark.;Aarhus Univ, Univ Byen 36, DK-8000 Aarhus, Denmark..
    Self-snapping hydrogel-based electroactive microchannels as nerve guidance conduits. 2022. In: MATERIALS TODAY BIO, ISSN 2590-0064, Vol. 16, article id 100437. Article in journal (Refereed)
    Abstract [en]

    Peripheral nerve regeneration with large defects needs innovative design of nerve guidance conduits (NGCs) that combine anisotropic guidance, electrical induction and the right mechanical properties in one. Herein, we present, for the first time, facile fabrication and efficient neural differentiation guidance of anisotropic, conductive, self-snapping, hydrogel-based NGCs. The hydrogels were fabricated via crosslinking of graphitic carbon nitride (g-C3N4) upon exposure to blue light, incorporated with graphene oxide (GO). Incorporation of GO and in situ reduction greatly enhanced surface charges, while decayed light penetration endowed the hydrogel with an intriguing self-snapping feature by virtue of a crosslinking gradient. The hydrogels were in the optimal mechanical stiffness range for peripheral nerve regeneration and supported normal viability and proliferation of neural cells. The PC12 cells differentiated on the electroactive g-C3N4 H/rGO3 (3 mg/mL GO loading) hydrogel presented 47% longer neurite length than those on the pristine g-C3N4 H hydrogel. Furthermore, the NGC with aligned microchannels was successfully fabricated using sacrificial melt electrowriting (MEW) moulding; the anisotropic microchannels of 10 μm width showed optimal neurite guidance. Such anisotropic, electroactive, self-snapping NGCs may possess great potential for repairing peripheral nerve injuries.

  • 8.
    Aybek, Mehmet Onur
    et al.
    Arcticus Syst AB, Järfälla, Sweden..
    Jordao, Rodolfo
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems.
    Lundbäck, John
    Arcticus Syst AB, Järfälla, Sweden..
    Lundbäck, Kurt-Lennart
    Arcticus Syst AB, Järfälla, Sweden..
    Becker, Matthias
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    From the Synchronous Data Flow Model of Computation to an Automotive Component Model. 2021. In: Proceedings 26th IEEE International Conference on Emerging Technologies and Factory Automation, ETFA 2021, Institute of Electrical and Electronics Engineers (IEEE), 2021. Conference paper (Refereed)
    Abstract [en]

    The size and complexity of automotive software systems are steadily increasing. Software functions are subject to different requirements and belong to different functional domains of the car. Meanwhile, streaming applications have become increasingly relevant in emerging application areas such as Advanced Driving Assistance Systems. Among models for streaming applications, the Synchronous Data Flow model is well-known for its analysable properties. This work presents transformation rules that allow transforming applications described by the Synchronous Data Flow model to an automotive component model. The proposed transformation rules are implemented in the form of a software plugin for an automotive tool suite that allows for timing analysis, code synthesis and deployment to a Real-Time Operating System. To demonstrate the applicability of the proposed approach, a case study of a Kalman filter that is part of a simplified cruise control application is presented. An abstract Synchronous Data Flow model of the filter is transformed into a component that is deployed on an Electronic Control Unit with hard timing guarantees.
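One of the analysable properties usually associated with Synchronous Data Flow is that a consistent graph has a repetition vector, obtained from the balance equations q[src] * produced = q[dst] * consumed. The sketch below is a generic textbook-style illustration and not the transformation rules or the tool plugin from the paper; the function name, the edge tuple layout and the three-actor example are assumptions made for illustration, and the graph is assumed to be connected.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations for a connected graph.

    edges is a list of (src, dst, produced, consumed) tuples (hypothetical
    layout): src produces `produced` tokens per firing and dst consumes
    `consumed` tokens per firing on that channel.
    """
    rate = {actors[0]: Fraction(1)}  # fix one actor, propagate relative rates
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for src, dst, prod, cons in edges:
            if src == a and dst not in rate:
                rate[dst] = rate[a] * prod / cons
                pending.append(dst)
            elif dst == a and src not in rate:
                rate[src] = rate[a] * cons / prod
                pending.append(src)
    # Every channel must balance, otherwise the graph is inconsistent.
    for src, dst, prod, cons in edges:
        if rate[src] * prod != rate[dst] * cons:
            raise ValueError("inconsistent SDF graph")
    # Scale the fractional rates to the smallest integer solution.
    scale = reduce(lambda x, y: x * y // gcd(x, y),
                   (r.denominator for r in rate.values()), 1)
    return {a: int(r * scale) for a, r in rate.items()}


# Hypothetical three-actor chain: A -(2,3)-> B -(1,2)-> C
q = repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)])
print(q)
```

Firing each actor as many times as its entry in the repetition vector returns every channel to its initial token count, which is what makes static scheduling and buffer sizing possible.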

  • 9.
    Ayedh, H. M.
    et al.
    Univ Oslo, Dept Phys, POB 1048 Blindern, N-0316 Oslo, Norway.;Aalto Univ, Dept Elect & Nanoengn, Tietotie 3, FI-02150 Espoo, Finland..
    Kvamsdal, K-E
    Univ Oslo, Dept Phys, POB 1048 Blindern, N-0316 Oslo, Norway..
    Bobal, V
    Univ Oslo, Dept Phys, POB 1048 Blindern, N-0316 Oslo, Norway..
    Hallén, Anders
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Ling, F. C. C.
    Univ Hong Kong, Dept Phys, Pokfulam, Hong Kong, Peoples R China..
    Kuznetsov, A. Yu
    Univ Oslo, Dept Phys, POB 1048 Blindern, N-0316 Oslo, Norway..
    Carbon vacancy control in p⁺-n silicon carbide diodes for high voltage bipolar applications. 2021. In: Journal of Physics D: Applied Physics, ISSN 0022-3727, E-ISSN 1361-6463, Vol. 54, no. 45, article id 455106. Article in journal (Refereed)
    Abstract [en]

    Controlling the carbon vacancy (V_C) in silicon carbide (SiC) is one of the major remaining bottlenecks in the manufacturing of high voltage SiC bipolar devices, because V_C provokes recombination levels in the bandgap, severely reducing the charge carrier lifetime. In the literature, prominent V_C evolutions have been measured by capacitance spectroscopy employing Schottky diodes; however, the trade-offs occurring in p⁺-n diodes have received much less attention. In the present work, applying a similar methodology, we showed that V_C is re-generated to its unacceptably high equilibrium level of ~2 × 10¹³ V_C cm⁻³ by the 1800 °C anneals required for implanted acceptor activation in the p⁺-n components. Nevertheless, we have also demonstrated that V_C elimination by thermodynamic equilibrium anneals at 1500 °C employing a carbon cap can be readily integrated into p⁺-n component fabrication, resulting in ≤ 10¹¹ V_C cm⁻³, potentially paving the way towards the realization of high voltage SiC bipolar devices.

  • 10.
    Baccelli, Guido
    et al.
    Politecn Torino, DET, Turin, Italy..
    Stathis, Dimitrios
    KTH, School of Electrical Engineering and Computer Science (EECS).
    Hemani, Ahmed
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Martina, Maurizio
    Politecn Torino, DET, Turin, Italy..
    NACU: A Non-Linear Arithmetic Unit for Neural Networks. 2020. In: PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), IEEE, 2020. Conference paper (Refereed)
    Abstract [en]

    Reconfigurable architectures targeting neural networks are an attractive option. They allow multiple neural networks of different types to be hosted on the same hardware, in parallel or in sequence. Reconfigurability also grants the ability to morph into different micro-architectures to meet varying power-performance constraints. In this context, the need for a reconfigurable non-linear computational unit has not been widely researched. In this work, we present a formal and comprehensive method to select the optimal fixed-point representation to achieve the highest accuracy against the floating-point implementation benchmark. We also present a novel design of an optimised reconfigurable arithmetic unit for calculating non-linear functions. The unit can be dynamically configured to calculate the sigmoid, hyperbolic tangent, and exponential function using the same underlying hardware. We compare our work with the state-of-the-art and show that our unit can calculate all three functions without loss of accuracy.
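The abstract does not say how the unit shares hardware between the three functions or how the fixed-point format is chosen. As a rough illustration only, the sketch below uses the standard identity tanh(x) = 2·sigmoid(2x) − 1, which shows why one exponential-based datapath can in principle serve all three functions, and a brute-force sweep over the number of fractional bits that compares a quantized sigmoid against the floating-point reference. The 16-bit word length, the input range of −8 to 8 and all function names are assumptions, not details from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tanh_via_sigmoid(x):
    # Standard identity: tanh(x) = 2 * sigmoid(2x) - 1
    return 2.0 * sigmoid(2.0 * x) - 1.0


def quantize(x, frac_bits, total_bits=16):
    """Round x to a signed fixed-point value with saturation.

    total_bits=16 is an assumed word length for illustration.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))
    return q / scale


def max_error(frac_bits, xs):
    """Worst-case error of a sigmoid with quantized input and output."""
    return max(
        abs(quantize(sigmoid(quantize(x, frac_bits)), frac_bits) - sigmoid(x))
        for x in xs
    )


# Assumed input range: -8 to 8 in steps of 1/16.
xs = [i / 16.0 - 8.0 for i in range(257)]
errors = {f: max_error(f, xs) for f in range(4, 15)}
best = min(errors, key=errors.get)
print("fractional bits with the smallest worst-case error:", best)
```

The sweep exposes the usual trade-off: too few fractional bits give coarse rounding, while too many leave so few integer bits that the inputs saturate.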

  • 11.
    Backlund, Linus
    et al.
    KTH.
    Ngo, Kalle
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Gärtner, Joel
    KTH.
    Dubrova, Elena
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Secret Key Recovery Attacks on Masked and Shuffled Implementations of CRYSTALS-Kyber and Saber. Manuscript (preprint) (Other academic)
    Abstract [en]

    Shuffling is a well-known countermeasure against side-channel analysis. It typically uses the Fisher-Yates (FY) algorithm to generate a random permutation which is then utilized as the loop iterator to index the processing of the variables inside the loop. The processing order is scrambled as a result, making side-channel analysis more difficult. Recently, a side-channel attack on a masked and shuffled implementation of Saber requiring 61,680 power traces to extract the secret key was reported. In this paper, we present an attack that can recover the secret key of Saber from 4,608 traces. The key idea behind the 13-fold improvement is to recover FY indexes directly, rather than by extracting the message Hamming weight and bit flipping, as in the previous attack. We capture a power trace during the execution of the decapsulation algorithm for a given ciphertext, recover FY indexes 0 and 255, and extract the corresponding two message bits. Then, we modify the ciphertext to cyclically rotate the message, capture a power trace, and extract the next two message bits with FY indexes 0 and 255. In this way, all message bits can be extracted. By recovering messages contained in $k*l$ chosen ciphertexts constructed using a new method based on error-correcting codes with length $l$, where $k$ is the security level, we recover the long-term secret key. To demonstrate the generality of the presented approach, we also recover the secret key from a masked and shuffled implementation of CRYSTALS-Kyber, which NIST recently selected as a new public-key encryption and key-establishment algorithm to be standardized.
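The shuffling countermeasure described in the abstract can be sketched as follows: a Fisher-Yates permutation is generated and then used as the loop iterator, so the elements are processed in a scrambled order while the result is unchanged. This is a minimal illustration of the general technique only, not the Saber or CRYSTALS-Kyber code that the paper attacks; the function names and the toy bit-flip operation are assumptions, and a real implementation would draw its randomness from a secure source rather than a seeded `random.Random`.

```python
import random


def fisher_yates_permutation(n, rng):
    """Return a random permutation of range(n) using Fisher-Yates."""
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def shuffled_process(values, func, rng):
    """Apply func to every element, visiting the elements in shuffled order."""
    order = fisher_yates_permutation(len(values), rng)
    out = [None] * len(values)
    for idx in order:  # the permutation serves as the loop iterator
        out[idx] = func(values[idx])
    return out, order


# Seeded generator for a reproducible illustration only (not secure).
rng = random.Random(0)
result, order = shuffled_process(list(range(8)), lambda b: b ^ 1, rng)
print(order)
print(result)
```

The output values do not depend on the permutation; only the time at which each index is processed changes, which is why an attacker who can recover the FY indexes from a trace regains the link between trace position and message bit.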

  • 12.
    Baharloo, Mohammad
    et al.
    University of Tehran, Tehran, Iran.
    Khonsari, Ahmen
    University of Tehran, Tehran, Iran.
    Dolati, Mahdi
    University of Tehran, Tehran, Iran.
    Shiri, Pouya
    University of Victoria, BC, Canada.
    Ebrahimi, Masoumeh
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Rahmati, Dara
    University of Tehran, Tehran, Iran.
    Traffic-aware performance optimization in Real-time wireless network on chip. 2020. In: Nano Communication Networks, ISSN 1878-7789, E-ISSN 1878-7797, Vol. 26, article id 100321. Article in journal (Refereed)
    Abstract [en]

    Network on Chip (NoC) is a prevailing communication platform for multi-core embedded systems. Wireless network on chip (WNoC) employs wired and wireless technologies simultaneously to improve the performance and power-efficiency of traditional NoCs. In this paper, we propose a deterministic and scalable arbitration mechanism for the medium access control in the wireless plane and present its analytical worst-case delay model in a certain use-case scenario that considers both Real-time (RT) and Non Real-time (NRT) flows with different packet sizes. Furthermore, we design an optimization model to jointly consider the worst-case and the average-case performance parameters of the system. The optimization technique determines how NRT flows are allowed to use the wireless plane in such a way that all RT flows meet their deadlines and the average-case delay of the WNoC is minimized. Results show that our proposed approach decreases the average latency of network flows by up to 17.9% and 11.5% for 5 × 5 and 6 × 6 mesh sizes, respectively.

  • 13.
    Becker, Matthias
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Dasari, Dakshina
    Robert Bosch GmbH, Gerlingen, Germany..
    Casini, Daniel
    Scuola Super StAnna, TeCIP Inst, Pisa, Italy.;Scuola Super StAnna, Dept Excellence Robot & AI, Pisa, Italy..
    On the QNX IPC: Assessing Predictability for Local and Distributed Real-Time Systems. 2023. In: 2023 IEEE 29TH REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM, RTAS, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 289-302. Conference paper (Refereed)
    Abstract [en]

    With the advent of massively distributed applications such as those required by the IoT-to-Edge-to-Cloud compute continuum (i.e., automotive, smart agriculture, smart manufacturing, and more), real-time communication mechanisms allowing physically distributed nodes to seamlessly communicate as if they were running on the same host have acquired noteworthy importance. To this end, the synchronous inter-process communication (IPC) mechanism provided by the QNX operating system (OS) is a promising candidate, as it allows using the application programming interface for communicating in both single- and multi-node settings. Furthermore, it provides priority and partition inheritance mechanisms to improve predictability when working with the Adaptive Partitioning Scheduler (APS), a reservation-based scheduler provided by the QNX OS. This paper explores the behavior of the QNX synchronous message-passing (SyncMP) IPC with an extensive set of experiments, using them to formalize its behavior and model it from a real-time perspective. Then, it provides a response-time analysis for client-server applications based on the QNX SyncMP, building upon self-suspending task theory. Finally, we evaluate the analysis on an application based on the WATERS 2019 Challenge by Bosch.

  • 14.
    Becker, Matthias
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Chen, DeJiu
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics. KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Machine Design (Div.).
    An adaptive resource provisioning scheme for industrial SDN networks. 2019. In: IEEE International Conference on Industrial Informatics (INDIN), Institute of Electrical and Electronics Engineers Inc., 2019, p. 877-880. Conference paper (Refereed)
    Abstract [en]

    Many industrial domains face the challenge of ever growing networks, driven for example by Internet-of-Things and Industry 4.0. This typically comes together with increased network configuration and management efforts. In addition to the increasing network size, these domains typically are subject to adaptive load situations that pose an additional challenge on the network infrastructure. Software defined networking (SDN) is a promising networking paradigm that reduces configuration complexity and management effort in Ethernet networks. In this work, we investigate SDN in the context of adaptive scenarios with QoS constraints. Our approach applies monitoring of several thresholds which automatically trigger redistribution of resources via the central SDN controller. This setup leads to an agile system that can dynamically react to load changes while the infrastructure is not overprovisioned. The approach is implemented in a low-level simulation environment where we demonstrate the benefits of the approach using a case study.

  • 15.
    Becker, Matthias
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Lu, Zhonghai
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Chen, DeJiu
    KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Machine Design (Div.). KTH, School of Industrial Engineering and Management (ITM), Machine Design (Dept.), Mechatronics.
    Towards QoS-Aware Service-Oriented Communication in E/E Automotive Architectures. 2018. In: Proceedings of the 44th Annual Conference of the IEEE Industrial Electronics Society (IECON), Institute of Electrical and Electronics Engineers (IEEE), 2018, p. 4096-4101, article id 8591521. Conference paper (Refereed)
    Abstract [en]

    With the rise of increasingly advanced driving assistance systems in modern cars, execution platforms that build on the principle of service-oriented architectures are being proposed. Alongside, service-oriented communication is used to provide the required adaptive communication infrastructure on top of automotive Ethernet networks. A middleware is proposed that enables QoS-aware service-oriented communication between software components, where the prescribed behavior of each software component is defined by Assume/Guarantee (A/G) contracts. To enable the use of COTS components, which are often not sufficiently verified for use in automotive systems, the middleware monitors the communication behavior of components and verifies it against the component's A/G contract. A violation of the allowed communication behavior then triggers adaptation processes in the system while the impact on other communication is minimized. The applicability of the approach is demonstrated by a case study that utilizes a prototype implementation of the proposed approach.

  • 16.
    Becker, Matthias
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Mubeen, Saad
    Mälardalen University.
    Timing Analysis Driven Design-Space Exploration of Cause-Effect Chains in Automotive Systems. 2018. In: IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society, 2018. Conference paper (Refereed)
    Abstract [en]

    Model-based development and component-based software engineering have emerged as a promising approach to deal with enormous software complexity in automotive systems. This approach supports the development of software architectures by interconnecting (and reusing) software components (SWCs) at various abstraction levels. Automotive software architectures are often modeled with chains of SWCs, also called cause-effect chains, that are constrained by timing requirements. Based on the variations in activation patterns of SWCs, a single model of a cause-effect chain at a higher abstraction level can conform to several valid refined models of the chain at a lower abstraction level, which is closer to the system implementation. As a consequence, the total number of valid implementation-level models generated by the existing techniques increases exponentially, thereby significantly increasing the runtime of the timing analysis engines and limiting the scalability of the existing techniques. This paper computes an upper bound on the activation pattern combinations that may result from a system of cause-effect chains in a given high-level model of the software architecture. An efficient algorithm is presented that traverses only a reduced number of possible combinations of the cause-effect chains, resulting in the timing analysis of a significantly lower number of implementation-level models of the software architecture. A proof of concept is provided by conducting a case study that shows significant reduction in the runtime of timing analysis engines, i.e., the timing behavior of the considered system is verified by performing the timing analysis of only 27% of all possible combinations of the cause-effect chains.

  • 17. Ben Dhaou, I.
    et al.
    Kondoro, Aron
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system. University of Dar es Salaam, Tanzania.
    Kelati, Amleset
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system. University of Turku, Finland.
    Rwegasira, Diana
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system. University of Turku, Finland.
    Naiman, S.
    Mvungi, N. H.
    Tenhunen, Hannu
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Communication and security technologies for smart grid. 2018. In: Fog Computing: Breakthroughs in Research and Practice, IGI Global, 2018, pp. 305-331. Chapter in book, part of anthology (Other academic)
    Abstract [en]

    The smart grid is a new paradigm that aims to modernize the legacy power grid. It is based on the integration of ICT technologies, embedded systems, sensors, renewable energy and advanced algorithms for management and optimization. The smart grid is a system of systems in which communication technology plays a vital role. Safe operation of the smart grid needs a careful design of the communication protocols, cryptographic schemes, and computing technology. In this article, the authors describe current communication technologies and recently proposed algorithms, protocols, and architectures for securing the smart grid communication network. They analyze, in a unifying approach, the three principal pillars of the smart grid: sensors, communication technologies, and security. Finally, the authors elaborate on open issues in the smart grid communication network.

  • 18.
    Bitalebi, Hossein
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik.
    Geraeinejad, Vahid
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik.
    Ebrahimi, Masoumeh
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Near LLC versus near main memory processing. 2022. In: ACM Int. Conf. Proc. Ser., Association for Computing Machinery (ACM), 2022, pp. 1-6. Conference paper (Refereed)
    Abstract [en]

    Emerging advanced applications, such as deep learning and graph processing, with enormous processing demand and massive memory requests call for a comprehensive processing system or advanced solutions to address these requirements. Near data processing is one of the promising structures targeting this goal. However, most recent studies have focused on processing instructions near the main memory data banks while ignoring the benefits of processing instructions near other memory hierarchy levels such as the LLC. In this study, we investigate near LLC processing structures and compare them to the near main memory processing alternative, specifically in graphics processing units. We analyze these two structures on various applications in terms of performance and power. Results show a clear benefit of near LLC processing over near main memory processing in a class of applications. Further, we suggest an architecture that could benefit from both near main memory and near LLC processing structures, but that requires the applications to be characterized in advance or at run time.

  • 19.
    Bitalebi, Hossein
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Geraeinejad, Vahid
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Safaei, Farshad
    Shahid Beheshti University Iran, Tehran.
    Ebrahimi, Masoumeh
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    LATOA: Load-Aware Task Offloading and Adoption in GPU. 2023. In: Proceedings of the 15th Workshop on General Purpose Processing Using GPU, GPGPU 2023, Association for Computing Machinery (ACM), 2023, pp. 7-13. Conference paper (Refereed)
    Abstract [en]

    The emerging new applications, such as data mining and graph analysis, demand extra processing power at the hardware level. Conventional static task scheduling is no longer able to meet the requirements of such complicated applications. This inefficiency is a major concern when the application is supposed to run on a Graphics Processing Unit (GPU), where millions of instructions should be distributed among a limited number of processing cores. A non-optimal scheduling strategy leads to unfair load distribution among the GPU’s processing cores. Consequently, while busy cores are stalled due to the lack of resources, waiting for their data from the main memory, other cores are idle, waiting for busy cores to complete their tasks. Our study introduces LATOA, a Load-Aware Task Offloading and Adoption method that tackles this problem by reducing both stall and idle cycles. LATOA is the first study moving from static to dynamic task scheduling based on run-time information obtained from the Miss Status Holding Register (MSHR) tables. In LATOA, all processing cores are dynamically tagged with critical, neutral, or relaxed states. Then, irregular warps with low locality properties are detected and offloaded from critical cores (going to the stall state) to relaxed ones (going to the idle state). Based on our experiments, LATOA reduces the number of stall cycles on average by 24% and increases the neutral states on average by 38%. In addition, with negligible hardware overhead, LATOA improves system performance and power efficiency on average by 26% and 7%, respectively.

  • 20.
    Brisfors, Martin
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Forsmark, Sebastian
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Dubrova, Elena
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    How Deep Learning Helps Compromising USIM. 2021. In: Smart Card Research and Advanced Applications, CARDIS 2020 / [ed] Liardet, PY; Mentens, N, Springer Nature, 2021, Vol. 12609, pp. 135-150. Conference paper (Refereed)
    Abstract [en]

    It is known that secret keys can be extracted from some USIM cards using Correlation Power Analysis (CPA). In this paper, we demonstrate a more advanced attack on USIMs, based on deep learning. We show that a Convolutional Neural Network (CNN) trained on one USIM can recover the key from another USIM using at most 20 traces (four traces on average). Previous CPA attacks on USIM cards required high-quality oscilloscopes for power trace acquisition, an order of magnitude more traces from the victim card, and expert-level skills from the attacker. Now the attack can be mounted with a $1000 budget and basic skills in side-channel analysis.

  • 21.
    Brisfors, Martin
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Moraitis, Michail
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Dubrova, Elena
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Do Not Rely on Clock Randomization: A Side-Channel Attack on a Protected Hardware Implementation of AES. 2023. In: FPS 2022: Foundations and Practice of Security / [ed] Jourdan, GV; Mounier, L; Adams, C; Sedes, F; Garcia-Alfaro, J, Springer Nature, 2023, Vol. 13877, pp. 38-53. Conference paper (Refereed)
    Abstract [en]

    Clock randomization is one of the oldest countermeasures against side-channel attacks. Various implementations have been presented in the past, along with positive security evaluations. However, in this paper we show that it is possible to break countermeasures based on a randomized clock by sampling side-channel measurements at a frequency much higher than the encryption clock, synchronizing the traces with pre-processing, and targeting the beginning of the encryption. We demonstrate a deep learning-based side-channel attack on a protected FPGA implementation of AES which can recover a subkey from fewer than 500 power traces. In contrast to previous attacks on FPGA implementations of AES, which targeted the last round, the presented attack uses the first round as the attack point. Any randomized clock countermeasure is significantly weakened by an attack on the first round because the effect of randomness accumulated over multiple encryption rounds is lost.

  • 22. Bucaioni, A.
    et al.
    Becker, Matthias
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Lundback, J.
    Mackamul, H.
    From AMALTHEA to RCM and Back: A Practical Architectural Mapping Scheme. 2020. In: Proceedings - 46th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2020, Institute of Electrical and Electronics Engineers (IEEE), 2020, pp. 537-544. Conference paper (Refereed)
    Abstract [en]

    This paper focuses on the mapping between two industrial architectural languages: AMALTHEA and Rubus Component Model. Both languages are heavily used within the automotive domain for the design and timing analysis of automotive software, respectively. The main contribution of this paper is a mapping scheme between the two architectural languages enabling i) the translation of an AMALTHEA architecture into a Rubus Component Model architecture where high-precision timing analysis can be performed ii) and the back-propagation of the analysis results on the AMALTHEA architecture. We validate the applicability of the proposed mapping scheme using an industrial use case from the automotive domain: the brake-by-wire system. We discuss the industrial relevance and lessons learnt of this work using expert interviews.

  • 23.
    Bucaioni, Alessio
    et al.
    Mälardalen Univ, Box 883, S-72123 Västerås, Sweden..
    Becker, Matthias
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Enabling automated integration of architectural languages: An experience report from the automotive domain. 2022. In: Journal of Systems and Software, ISSN 0164-1212, E-ISSN 1873-1228, Vol. 184, p. 111106-, article id 111106. Article in journal (Refereed)
    Abstract [en]

    Modern automotive software systems consist of hundreds of heterogeneous software applications, belonging to separated function domains and often developed within distributed automotive ecosystems consisting of original equipment manufacturers, tier-1 and tier-2 companies. Hence, the development of modern automotive software systems is a formidable challenge. A well-known instrument for coping with the tremendous heterogeneity and complexity of modern automotive software systems is the use of architectural languages as a way of enabling different and specific views over these systems. However, the use of different architectural languages might come at the cost of reduced interoperability and automation, as different languages might have weak to no integration. In this article, we tackle the challenge of integrating two architectural languages heavily used in the automotive domain for the design and timing analysis of automotive software systems: AMALTHEA and Rubus Component Model. The main contributions of this paper are (i) a mapping scheme for the translation of an AMALTHEA architecture into a Rubus Component Model architecture where high-precision timing analysis can be run, and the back annotation of the analysis results on the starting AMALTHEA architecture; (ii) the implementation of the proposed scheme, which uses the concept of model transformations for enabling a full-fledged automated integration; (iii) the application of such automation on three industrial automotive systems being the brake-by-wire, the full-blown engine management system and the engine management system. We discuss and evaluate the proposed contributions using an online expert survey and the above-mentioned use cases. Based on the evaluation results, we conclude that the proposed automation mechanism is correct and applicable in industrial contexts. Besides, we observe that the performance of the automation mechanism does not degrade when translating large models with several thousands of elements. Finally, we conclude that experts in this field find the proposed contribution industrially relevant.

  • 24. Bucaioni, Alessio
    et al.
    Becker, Matthias
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Lundbäck, John
    Mackamul, Harald
    From AMALTHEA to RCM and Back: a Practical Architectural Mapping Scheme. 2020. Conference paper (Refereed)
    Abstract [en]

    This paper focuses on the mapping between two industrial architectural languages: AMALTHEA and Rubus Component Model. Both languages are heavily used within the automotive domain for the design and timing analysis of automotive software, respectively. The main contribution of this paper is a mapping scheme between the two architectural languages enabling i) the translation of an AMALTHEA architecture into a Rubus Component Model architecture where high-precision timing analysis can be performed ii) and the back-propagation of the analysis results on the AMALTHEA architecture. We validate the applicability of the proposed mapping scheme using an industrial use case from the automotive domain: the brake-by-wire system. We discuss the industrial relevance and lessons learnt of this work using expert interviews.

  • 25. Charif, Amir
    et al.
    Coelho, Alexandre
    Ebrahimi, Masoumeh
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system. KTH.
    Bagherzadeh, Nader
    Zergainoh, Nacer-Eddine
    First-Last: A Cost-Effective Adaptive Routing Solution for TSV-Based Three-Dimensional Networks-on-Chip. 2018. In: IEEE Transactions on Computers, ISSN 0018-9340, E-ISSN 1557-9956, Vol. 67, no. 10, pp. 1430-1444. Article in journal (Refereed)
    Abstract [en]

    3D integration opens up new opportunities for future multiprocessor chips by enabling fast and highly scalable 3D Network-on-Chip (NoC) topologies. However, to reduce the cost of through-silicon vias (TSVs), partially vertically connected NoCs, in which only a few vertical TSV links are available, have been gaining relevance. To reliably route packets under such conditions, we introduce a lightweight, efficient and highly resilient adaptive routing algorithm targeting partially vertically connected 3D-NoCs named First-Last. It requires a very low number of virtual channels (VCs) to achieve deadlock-freedom (2 VCs in the East and North directions and 1 VC in all other directions), and guarantees packet delivery as long as one healthy TSV connecting all layers is available anywhere in the network. An improved version of our algorithm, named Enhanced-First-Last, is also introduced and shown to dramatically improve performance under low TSV availability while still using fewer virtual channels than state-of-the-art algorithms. A comprehensive evaluation of the cost and performance of our algorithms is performed to demonstrate their merits with respect to existing solutions.

  • 26.
    Chen, DeJiu
    et al.
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Inbyggda styrsystem. KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Maskinkonstruktion (Avd.). KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion (Inst.), Mekatronik.
    Östberg, Kenneth
    RISE - Research Institutes of Sweden.
    Becker, Matthias
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Sivencrona, Håkan
    Zenuity AB.
    Warg, Fredrik
    RISE - Research Institutes of Sweden.
    Design of a Knowledge-Base Strategy for Capability-Aware Treatment of Uncertainties of Automated Driving Systems. 2018. In: Computer Safety, Reliability, and Security / [ed] Gallina B., Skavhaug A., Schoitsch E., Bitsch F., Cham, 2018, Vol. 11094. Conference paper (Refereed)
    Abstract [en]

    Automated Driving Systems (ADS) represent a key technological advancement in the area of Cyber-physical systems (CPS) and Embedded Control Systems (ECS) with the aim of promoting traffic safety and environmental sustainability. The operation of ADS, however, exhibits several uncertainties that, if improperly treated in development and operation, would lead to safety and performance related problems. This paper presents the design of a knowledge-base (KB) strategy for a systematic treatment of such uncertainties and their system-wide implications on design-space and state-space. In the context of this approach, we use the term Knowledge-Base (KB) to refer to the model that stipulates the fundamental facts of a CPS in regard to the overall system operational states, action sequences, as well as the related costs or constraint factors. The model constitutes a formal basis for describing, communicating and inferring particular operational truths as well as the belief and knowledge representing the awareness or comprehension of such truths. For the reasoning of ADS behaviors and safety risks, each system operational state is explicitly formulated as a conjunction of environmental state and some collective states showing the ADS capabilities for perception, control and actuation. Uncertainty Models (UM) are associated as attributes to such state definitions for describing and quantifying the corresponding belief or knowledge status due to the presence of evidence about system performance and deficiencies, etc. In a broader perspective, the approach is part of our research on bridging the gaps among intelligent functions, system capability and dependability for mission- and safety-critical CPS, through a combination of development- and run-time measures.

  • 27.
    Chen, Hui
    et al.
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210093, Peoples R China..
    Cheng, Kaifeng
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210093, Peoples R China..
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Fu, Yuxiang
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210093, Peoples R China..
    Li, Li
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing 210093, Peoples R China..
    Hyperbolic CORDIC-Based Architecture for Computing Logarithm and Its Implementation. 2020. In: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 67, no. 11, pp. 2652-2656. Article in journal (Refereed)
    Abstract [en]

    We present a CORDIC (Coordinate Rotation Digital Computer)-based method to compute the logarithm function with base 2 and validate this method by software simulation and hardware implementation. Technically, we overcome the limitation of traditional hyperbolic CORDIC and transform it based on the idea of generalized hyperbolic CORDIC so that it can be used to compute $\log_{2} x$ for $x \in [1, 2)$. The proposed method requires only simple shift-and-add operations and has a good trade-off between precision (or speed) and area. In MATLAB, we provide different precisions corresponding to the iterations of the transformed CORDIC for user needs. Using a pipelined structure and setting the number of iterations to 16 (the average relative error is $2.09 \times 10^{-6}$), we implement an example hardware circuit. Synthesized under the SMIC 65 nm CMOS technology, the circuit has an area of 24100 $\mu m^{2}$ and a computation time of 11.1 ns, which saves 31.04% area and improves computation speed by 6.92% on average compared with existing methods.
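
    To illustrate the shift-and-add idea, the following is a minimal sketch of the classic (textbook) hyperbolic CORDIC in vectoring mode. It is not the transformed, generalized hyperbolic CORDIC of the paper. It assumes the identity ln(x) = 2 * atanh((x - 1) / (x + 1)) and a final scaling by 1/ln(2); the function names are hypothetical, and floating point stands in for the fixed-point shifts a hardware design would use.

```python
import math

def shift_sequence(n):
    """First n hyperbolic CORDIC shift indices; 4, 13, 40, ... are repeated so the iteration converges."""
    shifts, i, repeat = [], 1, 4
    while len(shifts) < n:
        shifts.append(i)
        if i == repeat and len(shifts) < n:
            shifts.append(i)          # repeated iteration required by hyperbolic CORDIC
            repeat = 3 * repeat + 1
        i += 1
    return shifts

def cordic_log2(x, iterations=16):
    """Approximate log2(x) for x in [1, 2) with textbook hyperbolic CORDIC (vectoring mode)."""
    if not 1.0 <= x < 2.0:
        raise ValueError("x must lie in [1, 2)")
    # Vectoring mode drives yi towards 0 while z accumulates atanh(yi / xi).
    xi, yi, z = x + 1.0, x - 1.0, 0.0
    for s in shift_sequence(iterations):
        d = -1.0 if yi >= 0 else 1.0
        t = 2.0 ** -s                 # a right shift by s bits in hardware
        xi, yi = xi + d * yi * t, yi + d * xi * t
        z -= d * math.atanh(t)        # atanh(2^-s) would come from a small constant table
    return 2.0 * z / math.log(2.0)    # ln(x) = 2 * atanh(...), then convert to base 2
```

    With this textbook variant the accuracy after 16 iterations is noticeably lower than the figure reported in the abstract, which is consistent with the paper using a different, transformed iteration.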

  • 28.
    Chen, Hui
    et al.
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Peoples R China..
    Jiang, Lin
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Peoples R China..
    Luo, Yuanyong
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Peoples R China..
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Fu, Yuxiang
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Peoples R China..
    Li, Li
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Peoples R China..
    Yu, Zongguang
    Nanjing Univ, Sch Elect Sci & Engn, Nanjing, Peoples R China..
    A CORDIC-Based Architecture with Adjustable Precision and Flexible Scalability to Implement Sigmoid and Tanh Functions. 2020. In: IEEE International Symposium on Circuits and Systems, ISCAS 2020, IEEE, 2020. Conference paper (Refereed)
    Abstract [en]

    In artificial neural networks, tanh (hyperbolic tangent) and sigmoid functions are widely used as activation functions. Past methods to compute them may have shortcomings such as low precision or an inflexible architecture that is difficult to expand, so we propose a CORDIC-based architecture to implement sigmoid and tanh functions, which has adjustable precision and flexible scalability. It needs only shift-add-or-subtract operations to compute high-accuracy results, and its input range is easy to expand by scaling the negative iterations of CORDIC without changing the original architecture. We adopt the control variable method to explore the accuracy distribution through software simulation. A specific case (ARCH: (1, 15, 18), RMSE: $10^{-6}$) is designed and synthesized under the TSMC 40 nm CMOS technology; the report shows that it has an area of 36512.78 $\mu m^{2}$ and a power of 12.35 mW at a frequency of 1 GHz. The maximum operating frequency can reach 1.5 GHz, which is better than the state-of-the-art methods.
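
    As a rough illustration of how CORDIC can yield these activation functions, the sketch below uses textbook hyperbolic CORDIC in rotation mode to obtain sinh and cosh, then forms tanh as their ratio and sigmoid via the identity sigmoid(v) = (1 + tanh(v / 2)) / 2. It is an assumption-laden sketch, not the architecture of the paper: the function names are hypothetical, the input range is limited to the basic convergence range (no negative iterations), and the final division would need its own hardware unit.

```python
import math

def shift_sequence(n):
    """First n hyperbolic CORDIC shift indices; 4, 13, 40, ... are repeated so the iteration converges."""
    shifts, i, repeat = [], 1, 4
    while len(shifts) < n:
        shifts.append(i)
        if i == repeat and len(shifts) < n:
            shifts.append(i)
            repeat = 3 * repeat + 1
        i += 1
    return shifts

def cordic_tanh(theta, iterations=18):
    """Approximate tanh(theta) for |theta| <= 1.1 with hyperbolic CORDIC (rotation mode)."""
    if abs(theta) > 1.1:
        raise ValueError("theta outside the basic convergence range of this sketch")
    # Rotation mode drives z towards 0; x and y end up proportional to cosh and sinh.
    x, y, z = 1.0, 0.0, theta
    for s in shift_sequence(iterations):
        d = 1.0 if z >= 0 else -1.0
        t = 2.0 ** -s                 # a right shift by s bits in hardware
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)
    return y / x                      # the CORDIC gain cancels in the ratio

def cordic_sigmoid(v, iterations=18):
    """Approximate the logistic sigmoid for |v| <= 2.2 via sigmoid(v) = (1 + tanh(v / 2)) / 2."""
    return 0.5 * (1.0 + cordic_tanh(0.5 * v, iterations))
```

    Raising `iterations` tightens the result, which mirrors the adjustable-precision idea in the abstract, although the precision figures there refer to the authors' own design.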

  • 29. Chen, Hui
    et al.
    Yang, Heping
    Song, Wenqing
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Fu, Yuxiang
    Li, Li
    Yu, Zongguang
    Symmetric-Mapping LUT-Based Method and Architecture for Computing $X^{Y}$-Like Functions. 2021. In: IEEE Transactions on Circuits and Systems Part 1: Regular Papers, ISSN 1549-8328, E-ISSN 1558-0806, Vol. 68, no. 3, pp. 1231-1244. Article in journal (Refereed)
    Abstract [en]

    We propose a new method and hardware architecture to compute the functions expressed as $X^{Y}$ ($X$ and $Y$ are arbitrary floating-point numbers), which can support arbitrary Nth root, exponential and power operations. Because of the complexity of direct computation, we usually convert it to logarithm, multiplication, and antilogarithm operations. Traditional approaches suffer from long latency, large area and high power consumption. To solve this problem, we propose a symmetric-mapping lookup table (SM-LUT) capable of computing $\log_{2} x$ ($x \in [1, 2]$) and $2^{x}$ ($x \in [0, 1]$) simultaneously. It lays the foundation for computing $X^{Y}$. To further improve the hardware performance of our architecture, we propose a multi-region address searcher to speed up the calculation of SM-LUT. In addition, we use an optimized Vedic multiplier to shorten the critical path and improve the efficiency of multiplication, which is included in computing $X^{Y}$. Under the TSMC 40 nm CMOS technology, we design and synthesize a reference circuit to compute $X^{Y}$ with a maximum relative error of $10^{-3}$. The report shows that the reference circuit achieves an area of 14338.50 $\mu m^{2}$ and a power consumption of 4.59 mW at a frequency of 1 GHz. In comparison with the state-of-the-art work under the same input range and similar precision, it saves on average 78.57% area and 80.42% power consumption for $\sqrt[N]{R}$ computation, and 82.89% area and 81.89% power consumption for $R^{N}$ computation. On top of that, our architecture reduces the computation latency by 62.77% on average and has an energy efficiency one order of magnitude higher than others.
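
    The conversion to logarithm, multiplication and antilogarithm mentioned above can be written out as follows. This is a sketch of the standard identity, not the paper's exact datapath; the symbols $e$, $m$, $I$ and $F$ are introduced here for illustration only and assume $X > 0$.

```latex
\begin{align}
X^{Y} &= 2^{\,Y \log_{2} X} \\
X = 2^{e}\, m,\; m \in [1, 2) \quad &\Rightarrow \quad \log_{2} X = e + \log_{2} m \\
Y \log_{2} X = I + F,\; I \in \mathbb{Z},\; F \in [0, 1) \quad &\Rightarrow \quad X^{Y} = 2^{I} \cdot 2^{F}
\end{align}
```

    Under this decomposition only $\log_{2} m$ on $[1, 2)$ and $2^{F}$ on $[0, 1)$ need to be evaluated, which matches the two ranges the lookup table in the abstract covers; an Nth root corresponds to $Y = 1/N$.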

  • 30. Chen, Kun-Chih (Jimmy)
    et al.
    Ebrahimi, Masoumeh
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Wang, Ting-Yi
    Yang, Yuch-Chi
    NoC-based DNN Accelerator: A Future Design Paradigm. 2019. In: Proceedings of the 13th IEEE/ACM International Symposium on Networks-on-Chip, NOCS 2019, Association for Computing Machinery (ACM), 2019. Conference paper (Refereed)
    Abstract [en]

    Deep Neural Networks (DNN) have shown significant advantages in many domains such as pattern recognition, prediction, and control optimization. The edge computing demand in the Internet-of-Things era has motivated many kinds of computing platforms to accelerate the DNN operations. The most common platforms are CPU, GPU, ASIC, and FPGA. However, these platforms suffer from low performance (i.e., CPU and GPU), large power consumption (i.e., CPU, GPU, ASIC, and FPGA), or low computational flexibility at runtime (i.e., FPGA and ASIC). In this paper, we suggest the NoC-based DNN platform as a new accelerator design paradigm. The NoC-based designs can reduce the off-chip memory accesses through a flexible interconnect that facilitates data exchange between processing elements on the chip. We first comprehensively investigate conventional platforms and methodologies used in DNN computing. Then we study and analyze different design parameters to implement the NoC-based DNN accelerator. The presented accelerator is based on mesh topology, neuron clustering, random mapping, and XY-routing. The experimental results on LeNet, MobileNet, and VGG-16 models show the benefits of the NoC-based DNN accelerator in reducing off-chip memory accesses and improving runtime computational flexibility.
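
    For readers unfamiliar with XY-routing on a mesh, the following minimal sketch shows the usual dimension-ordered scheme: a packet first travels along the X dimension until its column matches the destination, then along Y. The function name and the coordinate convention are assumptions made for illustration and are not taken from the paper.

```python
def xy_route(src, dst):
    """Return the list of (x, y) routers visited from src to dst, X dimension first, then Y."""
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    while x != dst_x:                 # phase 1: move along X only
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:                 # phase 2: move along Y only
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path
```

    Because every packet finishes its X hops before taking any Y hop, the route between two routers is unique and deterministic.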

  • 31. Chen, S.
    et al.
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Hardware acceleration of multilayer perceptron based on inter-layer optimization. 2019. In: Proceedings - 2019 IEEE International Conference on Computer Design, ICCD 2019, Institute of Electrical and Electronics Engineers Inc., 2019, pp. 164-172. Conference paper (Refereed)
    Abstract [en]

    Multilayer Perceptron (MLP) is used in a broad range of applications. Hardware acceleration of MLP is one of the most promising ways to provide better performance and energy efficiency. Previous works focused on intra-layer optimization and layer-after-layer processing, leaving inter-layer optimization unstudied. In this paper, we propose hardware acceleration of MLPs based on inter-layer optimization, which allows us to overlap the execution of MLP layers. First we describe the inter-layer optimization from software and mathematical perspectives. Then, a reference Two-Neuron architecture that efficiently supports the inter-layer optimization is proposed and implemented. Area cost, performance and energy consumption are discussed to explore the scalability of the Two-Neuron architecture. Results show that the proposed MLP design optimized across layers achieves better performance and energy efficiency than the conventional intra-layer optimized designs. As such, inter-layer optimization provides another possible direction, besides intra-layer optimization, to gain further performance and energy improvements for the hardware acceleration of MLPs.

  • 32.
    Chen, Xiaowen
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Efficient Memory Access and Synchronization in NoC-based Many-core Processors. 2019. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    In NoC-based many-core processors, the memory subsystem and the synchronization mechanism are two important design aspects, since mining parallelism and pursuing higher performance require not only optimized memory management but also an efficient synchronization mechanism. Therefore, we are motivated to research efficient memory access and synchronization in three topics, namely, efficient on-chip memory organization, fair shared memory access, and efficient many-core synchronization.

    One major way of optimizing the memory performance is constructing a suitable and efficient memory organization. A distributed memory organization is more suitable to NoC-based many-core processors, since it features good scalability. We envision that it is essential to support Distributed Shared Memory (DSM) because of the huge amount of legacy code and easy programming. Therefore, we first adopt the microcoded approach to address DSM issues, aiming for hardware performance but maintaining the flexibility of programs. Second, we further optimize the DSM performance by reducing the virtual-to-physical address translation overhead. In addition to the general-purpose memory organization such as DSM, there exists special-purpose memory organization to optimize the performance of application-specific memory access. We choose Fast Fourier Transform (FFT) as the target application, and propose a multi-bank data memory specialized for FFT computation.

    In 3D NoC-based many-core processors, because processor cores and memories reside in different locations (center, corner, edge, etc.) of different layers, memory accesses behave differently due to their different communication distances. As the network size increases, the communication distance difference of memory accesses becomes larger, resulting in unfair memory access performance among different processor cores. This unfair memory access phenomenon may lead to high latencies of some memory accesses, thus negatively affecting the overall system performance. Therefore, we are motivated to study on-chip memory and DRAM access fairness in 3D NoC-based many-core processors through narrowing the round-trip latency difference of memory accesses as well as reducing the maximum memory access latency.

    Barrier synchronization is used to synchronize the execution of parallel processor cores. Conventional barrier synchronization approaches such as master-slave, all-to-all, tree-based, and butterfly are algorithm oriented. As many processor cores are networked on a single chip, contended synchronization requests may cause a large performance penalty. Motivated by this, and in contrast to the algorithm-based approaches, we choose another direction (i.e., exploiting efficient communication) to address the barrier synchronization problem. We propose cooperative communication as a means and combine it with the master-slave algorithm and the all-to-all algorithm to achieve efficient many-core barrier synchronization. In addition, a multi-FPGA implementation case study of fast many-core barrier synchronization is conducted.

  • 33.
    Chen, Xiaowen
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS). Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China.
    Lei, Yuanwu
    Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China..
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Chen, Shuming
    Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China..
    A Variable-Size FFT Hardware Accelerator Based on Matrix Transposition. 2018. In: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, ISSN 1063-8210, E-ISSN 1557-9999, Vol. 26, no. 10, pp. 1953-1966. Article in journal (Refereed)
    Abstract [en]

    Fast Fourier transform (FFT) is the kernel and the most time-consuming algorithm in the domain of digital signal processing, and the FFT sizes of different applications are very different. Therefore, this paper proposes a variable-size FFT hardware accelerator, which fully supports the IEEE-754 single-precision floating-point standard and the FFT calculation with a wide size range from 2 to 2^20 points. First, a parallel Cooley-Tukey FFT algorithm based on matrix transposition (MT) is proposed, which can efficiently divide a large FFT into several small FFTs that can be executed in parallel. Second, guided by this algorithm, the FFT hardware accelerator is designed, and several FFT performance optimization techniques such as hybrid twiddle factor generation, multibank data memory, block MT, and token-based task scheduling are proposed. Third, its VLSI implementation is detailed, showing that it can work at 1 GHz with an area of 2.4 mm² and a power consumption of 91.3 mW at 25 °C, 0.9 V. Finally, several experiments are carried out to evaluate the proposal's performance in terms of FFT execution time, resource utilization, and power consumption. Comparative experiments show that our FFT hardware accelerator achieves speedups of up to 18.89x in comparison to two software-only solutions and two hardware-dedicated solutions.
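
    The decomposition named in the abstract above can be illustrated in software. The following is a minimal sketch of the textbook "four-step" Cooley-Tukey factorization, in which a length n1*n2 transform is split into small independent transforms separated by matrix transposes. It is not the accelerator's algorithm as published: the function names, the naive inner DFT, and the row-major layout are assumptions chosen for illustration.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, standing in for the small sub-transforms."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def four_step_fft(x, n1, n2):
    """DFT of length n1*n2 built from small DFTs and transposes (illustrative)."""
    n = n1 * n2
    assert len(x) == n
    # View the input as an n1 x n2 matrix: a[j1][j2] = x[j1*n2 + j2].
    a = [x[r * n2:(r + 1) * n2] for r in range(n1)]
    # Transpose, then run n2 independent length-n1 DFTs: cols[j2][k1].
    cols = [dft(row) for row in transpose(a)]
    # Multiply by the twiddle factors exp(-2*pi*i*j2*k1/n).
    cols = [[v * cmath.exp(-2j * cmath.pi * j2 * k1 / n)
             for k1, v in enumerate(row)]
            for j2, row in enumerate(cols)]
    # Transpose, then run n1 independent length-n2 DFTs: rows[k1][k2].
    rows = [dft(row) for row in transpose(cols)]
    # Output index is k = k1 + n1*k2, so transpose once more and flatten.
    return [v for row in transpose(rows) for v in row]
```

    Each list comprehension over rows is a batch of independent sub-transforms, which is the property that makes this factorization attractive for parallel hardware.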

  • 34.
    Chen, Yancang
    et al.
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China..
    Xie, Lunguo
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China..
    Li, Jinwen
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China..
    Shi, Zhu
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China..
    Zhang, Minxuan
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China..
    Chen, Xiaowen
    Natl Univ Def Technol, Dept Comp, Changsha, Hunan, Peoples R China..
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    A Trace-driven Hardware-level Simulator for Design and Verification of Network-on-Chips. 2010. In: 2011 INTERNATIONAL CONFERENCE ON COMPUTERS, COMMUNICATIONS, CONTROL AND AUTOMATION (CCCA 2011), VOL II / [ed] Thaung, K S, IEEE, 2010, pp. 32-35. Conference paper (Refereed)
    Abstract [en]

    Traditional communications of general-purpose multi-core processors and application-specific Systems-on-Chip face challenges in terms of scalability and complexity. Network-on-Chip (NoC) has been the most promising solution for the communications of multi-core and many-core chips. In this paper, we present a trace-driven hardware-level simulator (denoted HS) based on SystemVerilog for the design and verification of NoCs. Unlike state-of-the-art NoC simulators, the HS has three important characteristics in addition to the capability of creating simulation and synthesizable NoC descriptions: 1) hardware-level simulation can be done, which captures more hardware implementation details than flit-level simulation; 2) router debugging and verification can be done at RTL by inserting assertions and coverage; 3) trace-based application simulations can be done besides synthetic workloads. A 4 × 4 2D mesh NoC with output virtual-channel routers verifies the capability of our HS.

  • 35.
    Chen, Yizhi
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Nevarez, Yarib
    University of Bremen, Institute of Electrodynamics and Microelectronics (ITEM.ids), Bremen, Germany.
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Garcia-Ortiz, Alberto
    KTH, Skolan för elektroteknik och datavetenskap (EECS).
    Accelerating Non-Negative Matrix Factorization on Embedded FPGA with Hybrid Logarithmic Dot-Product Approximation. 2022. In: Proceedings: 2022 IEEE 15th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2022, Institute of Electrical and Electronics Engineers (IEEE), 2022, pp. 239-246. Conference paper (Refereed)
    Abstract [en]

    Non-negative matrix factorization (NMF) is an effective method for dimensionality reduction and sparse decomposition. This method has been of great interest to the scientific community in applications including signal processing, data mining, compression, and pattern recognition. However, NMF implies elevated computational costs in terms of performance and energy consumption, which makes it ill-suited to embedded applications. To overcome this limitation, we implement the vector dot-product with hybrid logarithmic approximation as a hardware optimization approach. This technique accelerates floating-point computation, reduces energy consumption, and preserves accuracy. To demonstrate our approach, we employ a design exploration flow using high-level synthesis on an embedded FPGA. Compared with software solutions on an ARM CPU, this hardware implementation accelerates the overall matrix decomposition by 5.597× and reduces energy consumption by 69.323×. Log-approximation NMF combined with KNN (k-nearest neighbors) shows only a 2.38% decrease in accuracy compared with KNN applied to the matrix after floating-point NMF on MNIST. Furthermore, compared with a dedicated floating-point accelerator, the logarithmic approximation approach achieves 3.718× acceleration and 8.345× energy reduction. Compared with the fixed-point approach, our approach has an accuracy degradation of 1.93% on MNIST and an accuracy improvement of 28.2% on the FASHION MNIST data set without prior knowledge of the data range. Thus, our approach has better compatibility with the input data range.

  • 36.
    Chen, Yizhi
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Zhu, Wenyao
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Chen, DeJiu
    KTH, Skolan för industriell teknik och management (ITM), Maskinkonstruktion, Mekatronik och inbyggda styrsystem.
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Online Image Sensor Fault Detection for Autonomous Vehicles. 2022. In: Proceedings: 2022 IEEE 15th International Symposium on Embedded Multicore/Many-Core Systems-on-Chip, MCSoC 2022, Institute of Electrical and Electronics Engineers Inc., 2022, pp. 120-127. Conference paper (Refereed)
    Abstract [en]

    Automated driving vehicles have shown great potential in the near-future market due to their high safety and convenience for drivers and passengers. The reliability of image sensors attracts much research interest, as many image sensors are used in autonomous vehicles. We propose an online image sensor fault detection method based on comparing the historical variances of normal pixels and defective pixels. For fault pixels without uncertainty, with a detection window of more than 30 frames, we get 100% accuracy and 100% recall on realistic continuous traffic pictures from the KITTI data set. We also explore the influence of fault pixel values' uncertainty from 0% to 25% and study different fixed thresholds and a dynamic threshold for judgments. A strict threshold of 0.1 has a high accuracy (99.16%) but a low recall (34.46%) for 15% uncertainty. A loose threshold of 0.3 has a relatively high recall (83.78%) but misclassifies too many normal pixels, with 18.17% accuracy for 15% uncertainty. Our dynamic threshold balances accuracy and recall. It gets 100% accuracy and 58.69% recall for 5% uncertainty and 78.38% accuracy and 55.39% recall for 15% uncertainty. Based on the detected damaged pixel rate, we develop a health score for evaluating the image sensor system intuitively. It can also be helpful for making decisions about replacing cameras.
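
    The idea of comparing per-pixel variance over a window of frames can be sketched as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not give the decision rule, so the rule used here (flag a pixel when its variance falls below `threshold` times the mean variance of all pixels), the function names, and the health score defined as the fraction of unflagged pixels are all hypothetical.

```python
from statistics import pvariance


def detect_stuck_pixels(frames, threshold=0.1):
    """Return the set of (row, col) pixels whose variance over time is unusually low.

    frames: list of 2D lists of pixel intensities, one per frame.
    Assumed rule: variance < threshold * mean variance over all pixels.
    """
    height, width = len(frames[0]), len(frames[0][0])
    # Historical variance of every pixel across the detection window.
    variances = [[pvariance([f[r][c] for f in frames]) for c in range(width)]
                 for r in range(height)]
    mean_var = sum(map(sum, variances)) / (height * width)
    return {(r, c)
            for r in range(height) for c in range(width)
            if variances[r][c] < threshold * mean_var}


def health_score(faulty, height, width):
    """Hypothetical health score: fraction of pixels not flagged as faulty."""
    return 1 - len(faulty) / (height * width)
```

    For example, with a window of 30 frames in which one pixel never changes while the others vary, only the constant pixel is flagged.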

  • 37.
    Chen, Zhe
    et al.
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China..
    Guo, Shize
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China..
    Wang, Jian
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China..
    Li, Yubai
    Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Sichuan, Peoples R China..
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Toward FPGA Security in IoT: A New Detection Technique for Hardware Trojans. 2019. In: IEEE Internet of Things Journal, ISSN 2327-4662, Vol. 6, no. 4, pp. 7061-7068. Article in journal (Refereed)
    Abstract [en]

    Nowadays, field programmable gate array (FPGA) has been widely used in Internet of Things (IoT) since it can provide flexible and scalable solutions to various IoT requirements. Meanwhile, hardware Trojan (HT), which may lead to undesired chip function or leak sensitive information, has become a great challenge for FPGA security. Therefore, distinguishing the Trojan-infected FPGAs is quite crucial for reinforcing the security of IoT. To achieve this goal, we propose a clock-tree-concerned technique to detect the HTs on FPGA. First, we present an experimental framework which helps us to collect the electromagnetic (EM) radiation emitted by FPGA clock tree. Then, we propose a Trojan identifying approach which extracts the mathematical feature of obtained EM traces, i.e., 2-D principal component analysis (2DPCA) in this paper, and automatically isolates the Trojan-infected FPGAs from the Trojan-free ones by using a BP neural network. Finally, we perform extensive experiments to evaluate the effectiveness of our method. The results reveal that our approach is valid in detecting HTs on FPGA. Specifically, for the trust-hub benchmarks, we can find out the FPGA with always on Trojans (100% detection rate) while identifying the triggered Trojans with high probability (by up to 92%). In addition, we give a thorough discussion on how the experimental setup, such as probe step size, scanning area, and chip ambient temperature, affects the Trojan detection rate.

  • 38. Cotronis, Y.
    et al.
    Daneshtalab, Masoud
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system. George Angelos Papadopoulos, University of Cyprus, Cyprus.
    Preface from the Chairs. 2016. In: Proceedings - 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2016, Institute of Electrical and Electronics Engineers Inc., 2016, pp. xv-xvi, article id 7445303. Conference paper (Refereed)
  • 39. Cui, L.
    et al.
    Liu, X.
    Wu, F.
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Xie, C.
    A Low Bit-Width LDPC Min-Sum Decoding Scheme for NAND Flash. 2022. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 41, no. 6, pp. 1971-1975. Article in journal (Refereed)
    Abstract [en]

    For NAND flash memory, designing a good low-density parity-check (LDPC) decoding algorithm could ensure data reliability. When the decoding algorithm is implemented in hardware, it is necessary to achieve an attractive trade-off between implementation complexity and decoding performance. In this paper, a novel low bit-width decoding scheme is introduced. In this scheme, a Quasi-Cyclic LDPC (QC-LDPC) code is used, and the row-layered normalized min-sum algorithm is improved by restricting the amplitude of the minimum and second-minimum values in each check node (CN) update. The simulation shows that our approach achieves a lower UBER (Uncorrectable Bit Error Rate) with a negligible increase in computational complexity, especially with low-precision input log-likelihood ratios (LLRs).
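
    The check-node step mentioned in the abstract above can be sketched as follows. This is a minimal sketch of a normalized min-sum check-node update with magnitude clipping, not the authors' decoder: the function name, the normalization factor `alpha`, the clip level, and the choice to clip before scaling are assumptions made for illustration.

```python
def check_node_update(llrs, alpha=0.75, clip=3.0):
    """Normalized min-sum update for one check node (illustrative).

    llrs: incoming variable-to-check messages (at least two values).
    Returns the outgoing check-to-variable message for every edge.
    """
    mags = [abs(v) for v in llrs]
    # Position and value of the smallest magnitude, and the second smallest.
    idx_min = min(range(len(llrs)), key=mags.__getitem__)
    min1 = mags[idx_min]
    min2 = min(m for i, m in enumerate(mags) if i != idx_min)
    # Assumed restriction: cap both minima at a fixed amplitude.
    min1, min2 = min(min1, clip), min(min2, clip)
    # Product of the signs of all incoming messages.
    sign_all = 1
    for v in llrs:
        if v < 0:
            sign_all = -sign_all
    out = []
    for i, v in enumerate(llrs):
        sign_i = -1 if v < 0 else 1
        # The edge that supplied min1 receives min2; all others receive min1.
        mag = min2 if i == idx_min else min1
        # sign_all * sign_i is the sign product over the other edges.
        out.append(sign_all * sign_i * alpha * mag)
    return out
```

    Only two magnitudes and one index need to be stored per check node, which is why capping their amplitude directly bounds the bit width of the stored messages.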

  • 40.
    Daneshtalab, Masoud
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Ejlali, A.
    Kargahi, M.
    Special section on design for resilience in cyber-physical systems. 2018. In: CSI International Symposium on Real-Time and Embedded Systems and Technologies, RTEST 2018, Institute of Electrical and Electronics Engineers (IEEE), 2018. Conference paper (Refereed)
  • 41. Dasari, D.
    et al.
    Becker, Matthias
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Casini, D.
    Blas, T.
    End-to-End Analysis of Event Chains under the QNX Adaptive Partitioning Scheduler. 2022. In: Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, RTAS, Institute of Electrical and Electronics Engineers (IEEE), 2022, pp. 214-227. Conference paper (Refereed)
    Abstract [en]

    Modern autonomous cars run classic AUTOSAR applications alongside advanced driving assistance systems on a single-vehicle computer. Ensuring safety and predictability in such a complex system is challenging and requires temporal isolation between the various components. A promising solution is the POSIX-compliant QNX operating system: it meets the automotive standards for functional safety at the highest level (ISO 26262 ASIL-D) and provides temporal isolation through the Adaptive Partitioning Scheduler (APS), a resource reservation algorithm that guarantees processor bandwidth to groups of threads. These guarantees make it an ideal platform for composing diverse and complex applications on centralized vehicle computers. However, so far, there is no precise description or analysis of the APS reservation mechanism in real-time literature. In this paper, we provide the first description of the behavior of the APS from a real-time point of view and validate the results by running experiments on a real QNX platform. Based on the derived scheduler rules, we develop a response-time analysis to bound the end-to-end latency of event chains under APS. Finally, we evaluate different design strategies on a case study based on a real autonomous construction vehicle. 

  • 42. Delmas, M.
    et al.
    Höglund, L.
    Ivanov, R.
    Naureen, S.
    Ramos Santesmases, David
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system. IRnova AB, Electrum 236 - C5, Kista, SE-164 40, Sweden.
    Evans, D.
    Becanovic, S.
    Almqvist, S.
    Rihtnesberg, D.
    Fattala, S.
    Smuk, S.
    Costard, E.
    HOT MWIR T2SL detectors to reduce system: Size, weight, and power. 2021. In: Sensors, Systems, and Next-Generation Satellites XXV, SPIE - International Society for Optical Engineering, 2021, Vol. 11858, article id 118580Z. Conference paper (Refereed)
    Abstract [en]

    In 2019, IRnova launched full-scale production of reduced size, weight and power integrated detector dewar cooler assemblies (Oden MW; VGA format with 15 μm pixel pitch) covering the full mid-wavelength infrared spectral domain (3.7 μm - 5.1 μm). Oden MW exhibits excellent performance with operating temperatures up to 110 K at F/5.5 with typical values of temporal and spatial noise equivalent temperature of 22 mK and 7 mK, respectively, and an operability higher than 99.85%. More recently, IRnova developed a new detector design with a cut-off wavelength of 5.3 μm which can potentially allow an operating temperature of the detector up to 150 K. Excellent performance was demonstrated on single pixels, with a quantum efficiency as high as 46% at 4 μm without antireflection coating, a turn-on bias lower than -100 mV and a dark current density as low as 8 × 10⁻⁶ A/cm², which is a factor of < 5 higher than Rule07. The dark current was also found to be independent of the device size, ranging from 10 μm to 223 μm, indicating that surface leakage currents are not limiting the dark current. The achievable operating temperature of an FPA made of this new detector design has been estimated to be <150 K with F/5.5 optics. These outstanding results demonstrate that this new generation of detector design is an excellent candidate for future high operating temperature and high-definition focal plane arrays.

  • 43.
    Delmas, Marie
    et al.
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Ramos Santesmases, David
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system. IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Ivanov, Ruslan
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Zurauskaite, Laura
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Evans, Dean
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Rihtnesberg, David
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Almqvist, Susanne
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Becanovic, Smilja
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Costard, Eric
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    Hoglund, Linda
    IRnova AB, Isafjordsgatan 22, S-16440 Kista, Sweden..
    High performance type-II InAs/GaSb superlattice infrared photodetectors with a short cut-off wavelength. 2023. In: Opto-Electronics Review, ISSN 1230-3402, E-ISSN 1896-3757, Vol. 31, no. 1, article id 144555. Article in journal (Refereed)
    Abstract [en]

    This work investigates the potential of InAs/GaSb superlattice detectors for the short-wavelength infrared spectral band. A barrier detector structure was grown by molecular beam epitaxy and devices were fabricated using standard photolithography techniques. Optical and electrical characterisations were carried out and the current limitations were identified. The authors found that the short diffusion length of ~1.8 μm is currently limiting the quantum efficiency (double-pass, no anti-reflection coating) to 43% at 2.8 μm and 200 K. The dark current density is limited by the surface leakage current which shows generation-recombination and diffusion characters below and above 195 K, respectively. By fitting the size dependence of the dark current, the bulk values have been estimated to be 6.57 × 10⁻⁶ A/cm² at 200 K and 2.31 × 10⁻⁶ A/cm² at 250 K, which is only a factor of 4 and 2, respectively, above the Rule07.

  • 44. Dhaou, I. B.
    et al.
    Kondoro, Aron
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Kelati, Amleset
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system. KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Integrerade komponenter och kretsar.
    Rwegasira, Diana
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Naiman, S.
    Mvungi, N. H.
    Tenhunen, Hannu
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Integrerade komponenter och kretsar.
    Communication and security technologies for smart grid. 2017. In: International Journal of Embedded and Real-Time Communication Systems, ISSN 1947-3176, E-ISSN 1947-3184, Vol. 8, no. 2, pp. 40-65. Article in journal (Refereed)
    Abstract [en]

    The smart grid is a new paradigm that aims to modernize the legacy power grid. It is based on the integration of ICT technologies, embedded systems, sensors, renewable energy and advanced algorithms for management and optimization. The smart grid is a system of systems in which communication technology plays a vital role. Safe operation of the smart grid needs a careful design of the communication protocols, cryptographic schemes, and computing technology. In this article, the authors describe current communication technologies and recently proposed algorithms, protocols, and architectures for securing the smart grid communication network. They analyze, in a unifying approach, the three principal pillars of the smart grid: sensors, communication technologies, and security. Finally, the authors elaborate on open issues in the smart grid communication network.

  • 45. Du, G.
    et al.
    Yang, Z.
    Li, Z.
    Zhang, D.
    Yin, Y.
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    NR-MPA: Non-recovery compression based multi-path packet-connected-circuit architecture of convolution neural networks accelerator. 2019. In: Proceedings - 2019 IEEE International Conference on Computer Design, ICCD 2019, Institute of Electrical and Electronics Engineers (IEEE), 2019, pp. 173-176, article id 8988763. Conference paper (Refereed)
    Abstract [en]

    Convolution Neural Networks (CNNs) involve massive data to be calculated and stored. To meet these challenges, parallel hardware accelerators consisting of hundreds of Processing Elements (PEs) arranged as a many-core system-on-chip and connected by a Network-on-Chip (NoC) have been proposed, which achieve high throughput by exploiting the parallel PE array. However, most existing accelerators focus on only one aspect, such as the compute structure of the PE or the data movement overhead over the NoC, so the throughput, area and latency of the accelerator are not fully optimized. In this paper, we propose an efficient general-purpose CNN accelerator that addresses both compute, based on a Non-Recovery Compression (NRC) method, and data movement, using a novel Multi-Paths Packet Connection Circuit (MP-PCC). NRC can save computation time due to zero multipliers through shift decoding in the PE and improve power efficiency by saving a large number of data transmissions. MP-PCC, evolved from the Packet Connection Circuit, supports single and multicast transmission modes at the same time, and changes the multicast (X, Y) routing algorithm to a multicast Y algorithm to improve the transmission efficiency. The proposed architecture, which was implemented on a Xilinx FPGA, achieves 17.7x faster computation speed and 2.2x fewer memory accesses compared with the state-of-the-art method.

  • 46.
    Du, Gaoming
    et al.
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China..
    Liu, Guanyu
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China..
    Li, Zhenmin
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China..
    Cao, Yifan
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China..
    Zhang, Duoli
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China..
    Ouyang, Yiming
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China..
    Gao, Minglun
    Hefei Univ Technol, 193 Tunxi Rd, Hefei, Anhui, Peoples R China..
    Lu, Zhonghai
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    SSS: Self-aware System-on-chip Using a Static-dynamic Hybrid Method. 2019. In: ACM Journal on Emerging Technologies in Computing Systems, ISSN 1550-4832, E-ISSN 1550-4840, Vol. 15, no. 3, article id 28. Article in journal (Refereed)
    Abstract [en]

    Network-on-Chip (NoC) has become the de facto communication standard for multi-core or many-core System-on-Chip (SoC) due to its scalability and flexibility. However, an important factor in NoC design is temperature, which affects the overall performance of the SoC by decreasing circuit frequency, increasing energy consumption, and even shortening chip lifetime. In this article, we propose SSS, a self-aware SoC using a static-dynamic hybrid method that combines dynamic mapping and static mapping to reduce the hotspot temperature for NoC-based SoCs. First, we propose monitoring and thermal modeling for self-state sensing. Then, in the static mapping stage, we calculate the optimal mapping solutions under different temperature modes using the discrete firefly algorithm to help self-decision-making. Finally, in the dynamic mapping stage, we achieve dynamic mapping through configuring NoC and SoC sentient units for self-optimizing. Experimental results show that SSS reduces the peak temperature by up to 37.52%. The FPGA prototype proves the effectiveness and smartness of SSS in reducing hotspot temperature.

  • 47.
    Dubrova, Elena
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    A reconfigurable arbiter PUF with 4 x 4 switch blocks. 2018. In: Proceedings of The International Symposium on Multiple-Valued Logic, IEEE Computer Society, 2018, pp. 31-37. Conference paper (Refereed)
    Abstract [en]

    Physical Unclonable Functions (PUFs) exploit manufacturing process variation to create responses that are unique to individual integrated circuits (ICs). Typically, responses of a PUF cannot be modified once the PUF is fabricated. In applications which use PUFs as a long-term secret key, it would be useful to have a simple mechanism for reconfiguring the PUF in order to update the key periodically. In this paper, we present a new type of arbiter PUF which uses 4 x 4 switch blocks instead of the conventional 2 x 2 ones. Each 4 x 4 switch block can be reconfigured in many different ways during the PUF's lifetime, making regular key updates possible.

  • 48.
    Dubrova, Elena
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Ngo, Kalle
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Gärtner, Joel
    KTH.
    Breaking a Fifth-Order Masked Implementation of CRYSTALS-Kyber by Copy-Paste. Manuscript (preprint) (Other academic)
    Abstract [en]

    CRYSTALS-Kyber has been selected by the NIST as a public-key encryption and key encapsulation mechanism to be standardized. It is also included in the NSA's suite of cryptographic algorithms recommended for national security systems. This makes it important to evaluate the resistance of CRYSTALS-Kyber's implementations to side-channel attacks. The unprotected and first-order masked software implementations have been already analysed. In this paper, we present deep learning-based message recovery attacks on the ω-order masked implementations of CRYSTALS-Kyber in ARM Cortex-M4 CPU for ω ≤ 5. The main contribution is a new neural network training method called recursive learning. In the attack on an ω-order masked implementation, we start training from an artificially constructed neural network M^ω whose weights are partly copied from a model M^(ω-1) trained on the (ω-1)-order masked implementation, and then extended to one more share. Such a method allows us to train neural networks that can recover a message bit with the probability above 99% from high-order masked implementations.

  • 49.
    Dubrova, Elena
    et al.
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Ngo, Kalle
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system, Elektronik och inbyggda system.
    Gärtner, Joel
    KTH, Skolan för teknikvetenskap (SCI), Matematik (Inst.).
    Wang, Ruize
    KTH, Skolan för elektroteknik och datavetenskap (EECS), Elektroteknik, Elektronik och inbyggda system.
    Breaking a Fifth-Order Masked Implementation of CRYSTALS-Kyber by Copy-Paste. 2023. In: PROCEEDINGS OF THE 10TH ACM ASIA PUBLIC-KEY CRYPTOGRAPHY WORKSHOP, APKC 2023, Association for Computing Machinery (ACM), 2023, pp. 10-20. Conference paper (Refereed)
    Abstract [en]

    CRYSTALS-Kyber has been selected by the NIST as a public-key encryption and key encapsulation mechanism to be standardized. It is also included in the NSA's suite of cryptographic algorithms recommended for national security systems. This makes it important to evaluate the resistance of CRYSTALS-Kyber's implementations to side-channel attacks. The unprotected and first-order masked software implementations have been already analysed. In this paper, we present deep learning-based message recovery attacks on the ω-order masked implementations of CRYSTALS-Kyber in ARM Cortex-M4 CPU for ω ≤ 5. The main contribution is a new neural network training method called recursive learning. In the attack on an ω-order masked implementation, we start training from an artificially constructed neural network M^ω whose weights are partly copied from a model M^(ω-1) trained on the (ω-1)-order masked implementation, and then extended to one more share. Such a method allows us to train neural networks that can recover a message bit with the probability above 99% from high-order masked implementations.

  • 50.
    Dubrova, Elena
    et al.
    KTH, School of Electrical Engineering and Computer Science (EECS), Electrical Engineering, Electronics and Embedded systems, Electronics and Embedded systems.
    Selander, G.
    Näslund, Mats
    KTH.
    Lindqvist, Fredrik
    KTH.
    Lightweight message authentication for constrained devices. 2018. In: WiSec 2018 - Proceedings of the 11th ACM Conference on Security and Privacy in Wireless and Mobile Networks, Association for Computing Machinery (ACM), 2018, pp. 196-201. Conference paper (Refereed)
    Abstract [en]

    Message Authentication Codes (MACs) used in today's wireless communication standards may not be able to satisfy resource limitations of simpler 5G radio types and use cases such as machine type communications. As a possible solution, we present a lightweight message authentication scheme based on the cyclic redundancy check (CRC). It has been previously shown that a CRC with an irreducible generator polynomial as the key is an $\epsilon$-almost XOR-universal (AXU) hash function with $\epsilon = (m + n)/2^{n-1}$, where $m$ is the message size and $n$ is the CRC size. While the computation of $n$-bit CRCs can be efficiently implemented in hardware using linear feedback shift registers, generating random degree-$n$ irreducible polynomials is computationally expensive for large $n$. We propose using a product of $k$ irreducible polynomials whose degrees sum up to $n$ as a generator polynomial for an $n$-bit CRC and show that the resulting hash functions are $\epsilon$-AXU with $\epsilon = (m + n)^k/2^{n-k}$. The presented message authentication scheme can be seen as providing a trade-off between security and implementation efficiency.
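
The construction in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`axu_bound`, `gf2_mul`, `gf2_mod`, `crc`) and the tiny example polynomials are assumptions chosen for illustration, the bound is read as (m + n)^k / 2^(n - k), and the sketch covers only the CRC hash itself, not key generation or any protection of the tag.

```python
from fractions import Fraction


def axu_bound(m: int, n: int, k: int = 1) -> Fraction:
    """Bound epsilon = (m + n)^k / 2^(n - k); k = 1 is the irreducible-generator case."""
    return Fraction((m + n) ** k, 2 ** (n - k))


def gf2_mul(a: int, b: int) -> int:
    """Multiply two GF(2) polynomials stored as ints (bit i = coefficient of x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_mod(a: int, g: int) -> int:
    """Remainder of a(x) divided by g(x) over GF(2)."""
    deg_g = g.bit_length() - 1
    while a.bit_length() - 1 >= deg_g:
        a ^= g << (a.bit_length() - 1 - deg_g)
    return a


def crc(message: int, generator: int) -> int:
    """Plain CRC: remainder of message(x) * x^n modulo generator(x), n = deg(generator)."""
    n = generator.bit_length() - 1
    return gf2_mod(message << n, generator)


# Toy generator (assumption, far too small for real use): the product of the
# irreducible polynomials x^3 + x + 1 and x^2 + x + 1, so k = 2 and n = 3 + 2 = 5.
generator = gf2_mul(0b1011, 0b111)
tag = crc(0b1101, generator)

# Bounds for a 32-bit message and a 16-bit CRC with k = 1 and k = 2 factors.
bound_k1 = axu_bound(32, 16, 1)
bound_k2 = axu_bound(32, 16, 2)
```

Comparing `bound_k1` with `bound_k2` shows the trade-off the abstract mentions: a generator built from several lower-degree irreducible factors is cheaper to produce, but the bound on epsilon is weaker for the same CRC size.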
