Publications (10 of 32)
Xu, J., Zheng, Y., Shen, R., Wang, R., Li, J., Wang, D., . . . Hemani, A. (2026). SIMBRAIN: A nonidealities-aware simulation framework for spiking neural networks based on memristor crossbars. Neurocomputing, 665, Article ID 132107.
2026 (English). In: Neurocomputing, ISSN 0925-2312, E-ISSN 1872-8286, Vol. 665, article id 132107. Article in journal (Refereed) Published
Abstract [en]

Memristor crossbars have emerged as a promising computing paradigm for neural networks (NNs), excelling in executing multiply-accumulate (MAC) operations for artificial neural networks (ANNs). However, crossbar architectures still face challenges in handling the complex nonlinear cognitive functions of trace dynamics, which have become one of the most energy-intensive and memory-demanding aspects of implementing brain-inspired spiking neural networks (SNNs). Furthermore, the nonidealities of memristors remain a critical concern. Their impact on different NNs, especially biologically plausible SNNs, is still largely underexplored. While prior studies have proposed device-to-algorithm simulation frameworks that incorporate these nonidealities, efforts are still needed to bridge the gap between raw device data and ready-to-use memristor models. Therefore, this work introduces SIMBRAIN, an open-source device-to-network simulation framework that incorporates an all-in-one model and fitting acceleration strategies for translating device nonidealities into an integrated behavioral model. SIMBRAIN proposes a novel mapping strategy that first extends conventional crossbars to perform nonlinear cognitive functions for trace dynamics. Validated on two memristors, SIMBRAIN rapidly delivers both realistic accuracy results and circuit-level performance metrics by addressing batch processing challenges considering nonidealities. This work also systematically quantifies the impact of memristor nonidealities on trace-SNN performance compared to ANNs.

Place, publisher, year, edition, pages
Elsevier BV, 2026
Keywords
Simulation framework, Trace-STDP, Reconfigurable crossbar, Nonidealities, Spiking neural network (SNN)
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-376680 (URN), 10.1016/j.neucom.2025.132107 (DOI), 001631091900001 (), 2-s2.0-105024884281 (Scopus ID)
Note

QC 20260223

Available from: 2026-02-23. Created: 2026-02-23. Last updated: 2026-02-23. Bibliographically approved
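As an illustration of the crossbar multiply-accumulate (MAC) operation that the SIMBRAIN abstract refers to, the following is a minimal sketch and not SIMBRAIN's code or API. It assumes an ideal crossbar in which each column current is the dot product of the input voltages and the cell conductances, and it models a single nonideality as multiplicative Gaussian variation. The names `crossbar_mac` and `perturb`, the conductance values, and the 5% spread are hypothetical.

```python
import random


def crossbar_mac(voltages, conductances):
    """Ideal crossbar read-out: column current I_j = sum_i V_i * G[i][j]."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def perturb(conductances, rel_sigma, rng):
    """Assumed nonideality: each conductance is scaled by (1 + N(0, rel_sigma))."""
    return [[g * (1.0 + rng.gauss(0.0, rel_sigma)) for g in row]
            for row in conductances]


# Illustrative values only (siemens and volts); not taken from the paper.
G = [[1.0e-6, 2.0e-6],
     [3.0e-6, 4.0e-6]]
V = [0.1, 0.2]

ideal = crossbar_mac(V, G)
noisy = crossbar_mac(V, perturb(G, 0.05, random.Random(0)))
errors = [abs(a - b) for a, b in zip(ideal, noisy)]
```

Comparing `ideal` with `noisy` gives a rough feel for how device variation propagates to the MAC result; a real framework would model further effects that this sketch ignores.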
Xu, J., Zheng, Y., Stathis, D., Wang, R., Shen, R., Zheng, L.-R., . . . Hemani, A. (2025). MemMIMO: A Simulation Framework for Memristor-Based Massive MIMO Acceleration. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 44(11), 4327-4340
2025 (English). In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, ISSN 0278-0070, E-ISSN 1937-4151, Vol. 44, no 11, p. 4327-4340. Article in journal (Refereed) Published
Abstract [en]

Memristor-based crossbar architectures have proven highly effective for matrix vector multiplication (MVM) operations, making them a promising solution for accelerating the MVMs widely used in precoding algorithms for multiple-input-multiple-output (MIMO) wireless communication systems. However, real-world implementation of memristor-based computing systems faces challenges due to common nonidealities in both the devices and the peripheral circuits. To facilitate a rapid design flow and investigate the impact of nonidealities, an integrated open-source simulation framework, MemMIMO, is developed. The simulation framework estimates the accuracy and hardware performance of the computing system, offering a variety of flexible design options. MemMIMO integrates a behavioral model of the mixed-signal architecture with a digital front-end. There are three major building blocks in MemMIMO: 1) the device fitting block; 2) the mapping block; and 3) the performance estimation block. These blocks work together to map the complex MVMs in precoding algorithms for MIMO systems to crossbar-based architectures that incorporate memristor models characterized by physical device behavior. Using two typical use cases targeting sixth-generation (6G) massive MIMO communication as case studies, MemMIMO is used to model different memristor devices, explore the impact of nonidealities on system accuracy, and benchmark circuit-level performance metrics, including area, speed, and power.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
Precoding, Memristors, Computer architecture, Vectors, Massive MIMO, Integrated circuit modeling, Transceivers, Symbols, Array signal processing, OFDM, Complex matrix vector multiplication (MVM), memristor crossbar, mixed-signal behavior circuit model, multiple-input-multiple-output (MIMO), nonidealities
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-375101 (URN), 10.1109/TCAD.2025.3565478 (DOI), 001600047600005 (), 2-s2.0-105004048641 (Scopus ID)
Note

QC 20260113

Available from: 2026-01-13. Created: 2026-01-13. Last updated: 2026-01-13. Bibliographically approved
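The MemMIMO abstract mentions mapping complex MVMs onto crossbars. One textbook way to do this, shown here as a sketch and not necessarily the mapping MemMIMO uses, is to expand a complex matrix into a real block matrix [[Re, -Im], [Im, Re]] so that a real-valued array can compute the product. The function names are hypothetical, and the sketch ignores that physical conductances are non-negative (negative entries would need an extra scheme such as differential pairs).

```python
def complex_to_real_block(H):
    """Expand an m x n complex matrix into the 2m x 2n real matrix [[Re, -Im], [Im, Re]]."""
    top = [[z.real for z in row] + [-z.imag for z in row] for row in H]
    bottom = [[z.imag for z in row] + [z.real for z in row] for row in H]
    return top + bottom


def real_mvm(M, x):
    """Plain real-valued matrix-vector product (what an ideal crossbar would compute)."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def complex_mvm_via_real(H, x):
    """Compute H @ x for complex H and x using only real multiply-accumulates."""
    x_stacked = [z.real for z in x] + [z.imag for z in x]  # [Re(x); Im(x)]
    y = real_mvm(complex_to_real_block(H), x_stacked)      # [Re(y); Im(y)]
    m = len(H)
    return [complex(y[i], y[i + m]) for i in range(m)]
```

The real block matrix has four times as many entries as the complex matrix, which is one reason the mapping strategy matters for area and power estimates.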
Wang, D., Yan, X., Yu, Y., Stathis, D., Hemani, A., Lansner, A., . . . Zou, Z. (2025). Scalable Multi-FPGA HPC Architecture for Associative Memory System. IEEE Transactions on Biomedical Circuits and Systems, 19(2), 454-468
2025 (English). In: IEEE Transactions on Biomedical Circuits and Systems, ISSN 1932-4545, E-ISSN 1940-9990, Vol. 19, no 2, p. 454-468. Article in journal (Refereed) Published
Abstract [en]

Associative memory is a cornerstone of cognitive intelligence within the human brain. The Bayesian confidence propagation neural network (BCPNN), a cortex-inspired model with high biological plausibility, has proven effective in emulating high-level cognitive functions like associative memory. However, the current approach using GPUs to simulate BCPNN-based associative memory tasks encounters challenges in latency and power efficiency as the model size scales. This work proposes a scalable multi-FPGA high performance computing (HPC) architecture designed for the associative memory system. The architecture integrates a set of hypercolumn unit (HCU) computing cores for intra-board online learning and inference, along with a spike-based synchronization scheme for inter-board communication among multiple FPGAs. Several design strategies, including population-based model mapping, packet-based spike synchronization, and cluster-based timing optimization, are presented to facilitate the multi-FPGA implementation. The architecture is implemented and validated on two Xilinx Alveo U50 FPGA cards, achieving a maximum model size of 200x10 and a peak working frequency of 220 MHz for the associative memory system. Both the memory-bounded spatial scalability and compute-bounded temporal scalability of the architecture are evaluated and optimized, achieving a maximum scale-latency ratio (SLR) of 268.82 for the two-FPGA implementation. Compared to a two-GPU counterpart, the two-FPGA approach demonstrates a maximum latency reduction of 51.72x and a power reduction exceeding 5.28x under the same network configuration. Compared with the state-of-the-art works, the two-FPGA implementation exhibits a high pattern storage capacity for the associative memory task.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
Keywords
multi-FPGA, scalability, Associative memory, high performance computing (HPC), spiking neural network (SNN), Bayesian confidence propagation neural network (BCPNN)
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:kth:diva-363869 (URN), 10.1109/TBCAS.2024.3446660 (DOI), 001458211300017 (), 39163180 (PubMedID), 2-s2.0-85201785533 (Scopus ID)
Note

QC 20250526

Available from: 2025-05-26. Created: 2025-05-26. Last updated: 2025-05-26. Bibliographically approved
Pudi, D., Yu, Y., Stathis, D., Prajapati, S. K., Boppu, S., Hemani, A. & Cenkeramaddi, L. R. (2024). Application Level Synthesis: Creating Matrix-Matrix Multiplication Library: A Case Study. IEEE Access, 12, 155885-155903
2024 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 12, p. 155885-155903. Article in journal (Refereed) Published
Abstract [en]

Efficiently synthesizing an entire application that consists of multiple algorithms for hardware implementation is a very difficult and unsolved problem. One of the main challenges is the lack of good algorithmic libraries. A good algorithmic library should contain algorithmic implementations that are physically composable and whose cost metrics are accurately predictable. Physical composability and cost predictability can be achieved using a novel framework called SiLago. By physically abutting small hardware blocks together like Lego bricks, the SiLago framework can eliminate the time-consuming logic and physical synthesis and immediately give cost estimates with post-layout accuracy. In this paper, we build a library for the matrix-matrix multiplication algorithm based on the SiLago framework as a case study, because matrix-matrix multiplication is a fundamental operation in scientific computing that is frequently found in applications such as signal processing, image processing, pattern recognition, and robotics. This paper demonstrates the methodology to construct such a library containing composable and predictable algorithms so that application-level synthesis tools can use it to explore the design space for an entire application. Specifically, we present an algorithm for matrix decomposition, several mapping strategies for selected kernel functions, an algorithm to construct the mapping of each matrix-matrix multiplication, and finally, the method to calculate the cost estimate of each solution.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Coarse grain reconfigurable architecture, field programmable gate array (FPGA), dynamically reconfigurable resource array (DRRA), distributed memory architecture (DiMArch), matrix multiplication, high-level synthesis, hardware accelerators, hardware-software co-design
National Category
Embedded Systems
Identifiers
urn:nbn:se:kth:diva-356494 (URN), 10.1109/ACCESS.2024.3484175 (DOI), 001347210500001 (), 2-s2.0-85207469544 (Scopus ID)
Note

QC 20241115

Available from: 2024-11-15. Created: 2024-11-15. Last updated: 2024-11-15. Bibliographically approved
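The abstract above describes decomposing a matrix-matrix multiplication into smaller kernels. A generic software analogue is tiled (blocked) multiplication, sketched below. This is only an illustration of the decomposition idea under the assumption of square tiles; it is not the paper's decomposition algorithm, its SiLago mapping, or its cost model, and `matmul_tiled` is a hypothetical name.

```python
def matmul_tiled(A, B, tile):
    """Multiply A (n x k) by B (k x m) by iterating over tile x tile sub-blocks."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):            # tile row of C
        for j0 in range(0, m, tile):        # tile column of C
            for k0 in range(0, k, tile):    # accumulate partial products
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = C[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += A[i][kk] * B[kk][j]
                        C[i][j] = acc
    return C
```

Each innermost tile product plays the role of a fixed-size kernel; a library of such kernels with known costs is what would let a synthesis tool compare alternative tilings.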
Yousefzadeh, S., Yu, Y., Peter, A., Stathis, D. & Hemani, A. (2024). Exploration of Custom Floating-Point Formats: A Systematic Approach. In: Proceedings - 2024 27th Euromicro Conference on Digital System Design, DSD 2024. Paper presented at 27th Euromicro Conference on Digital System Design, DSD 2024, Paris, France, Aug 28 2024 - Aug 30 2024 (pp. 266-273). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 27th Euromicro Conference on Digital System Design, DSD 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 266-273. Conference paper, Published paper (Refereed)
Abstract [en]

The remarkable advancements in AI algorithms over the past three decades have been paralleled by an exponential growth in their complexity, with parameter counts soaring from 60,000 in LeNet during the late 1980s to a staggering 175 billion in ChatGPT 3.0. To mitigate this surge in memory footprint, approximate computing has emerged as a promising strategy, focusing on deploying the minimal resolution necessary to maintain acceptable accuracy. Yet, current practices are hindered by two major challenges: a) the process of identifying the optimal resolution and representation format for each tensor remains a manual, ad hoc task, and b) the representation, typically in floating point (FP) format, is confined to standardized norms predominantly supported by commercial-off-the-shelf (COTS) products like GPUs. This paper tackles these issues by introducing a systematic approach to exploring the FP representation design space to find the ideal FP format for each tensor, thereby leveraging the full potential of FP quantization techniques. It is designed for custom hardware, enabling access to arbitrary FP formats, but also allows users to limit their exploration to standard FP formats, making it compatible with COTS. Additionally, the proposed method explores the Block Floating-Point (BFP) and automatically decides on the size of the blocks. A heuristic-based search method is proposed to handle the large design space. The proposed approach is general, and the heuristic is not biased towards any specific category of algorithms. We apply this method to a Self-Organizing Map (SOM) for bacterial genome identification and LeNet-5 neural network, demonstrating a significant reduction in memory footprint by around 94% and 96%, respectively, compared to the conventional 32-bit FP baseline.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
approximate computing, block floating point, design space exploration, floating point, quantization
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-358137 (URN), 10.1109/DSD64264.2024.00043 (DOI), 001414927800034 (), 2-s2.0-85211894711 (Scopus ID)
Conference
27th Euromicro Conference on Digital System Design, DSD 2024, Paris, France, Aug 28 2024 - Aug 30 2024
Note

Part of ISBN 9798350380385

QC 20250115

Available from: 2025-01-07. Created: 2025-01-07. Last updated: 2026-03-09. Bibliographically approved
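To make the idea of an arbitrary floating-point format concrete, here is a minimal sketch of rounding a value to a format with a chosen number of exponent and mantissa bits, plus the resulting storage saving relative to 32 bits. It is not the paper's search method and does not cover block floating point. The simplifications are assumptions: an IEEE-like bias with `exp_bits >= 2`, no subnormals, infinities, or NaNs, flush-to-zero below the smallest normal value, and saturation above the largest. Both function names are hypothetical.

```python
import math


def quantize_fp(x, exp_bits, man_bits):
    """Round x to the nearest value in a simplified (1, exp_bits, man_bits) FP format."""
    if x == 0.0:
        return 0.0
    bias = 2 ** (exp_bits - 1) - 1
    e_min, e_max = 1 - bias, bias          # assumed normal exponent range
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))              # abs(x) == m * 2**e, 0.5 <= m < 1
    frac, e = 2.0 * m, e - 1               # abs(x) == frac * 2**e, 1 <= frac < 2
    if e < e_min:
        return 0.0                         # flush tiny values to zero
    scale = 2 ** man_bits
    frac = round(frac * scale) / scale     # keep man_bits fractional bits
    if frac >= 2.0:                        # rounding overflowed the mantissa
        frac, e = 1.0, e + 1
    if e > e_max:                          # saturate at the largest finite value
        frac, e = 2.0 - 1.0 / scale, e_max
    return sign * frac * 2.0 ** e


def footprint_reduction(exp_bits, man_bits, baseline_bits=32):
    """Fraction of storage saved versus a baseline word size (sign bit included)."""
    return 1.0 - (1 + exp_bits + man_bits) / baseline_bits
```

For example, a hypothetical 8-bit format with 4 exponent and 3 mantissa bits stores each value in a quarter of the space of a 32-bit float, so `footprint_reduction(4, 3)` evaluates to 0.75.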
Wang, D., Wang, Y., Yang, Y., Stathis, D., Hemani, A., Lansner, A., . . . Zou, Z. (2024). FPGA-Based HPC for Associative Memory System. In: 29TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2024. Paper presented at 29th Asia and South Pacific Design Automation Conference (ASP-DAC), JAN 22-25, 2024, BrainKorea Four 21, Incheon, SOUTH KOREA (pp. 52-57). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 29TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 52-57. Conference paper, Published paper (Refereed)
Abstract [en]

Associative memory plays a crucial role in the cognitive capabilities of the human brain. The Bayesian Confidence Propagation Neural Network (BCPNN) is a cortex model capable of emulating brain-like cognitive capabilities, particularly associative memory. However, the existing GPU-based approach for BCPNN simulations faces challenges in terms of time overhead and power efficiency. In this paper, we propose a novel FPGA-based high performance computing (HPC) design for the BCPNN-based associative memory system. Our design endeavors to maximize the spatial and timing utilization of FPGA while adhering to the constraints of the available hardware resources. By incorporating optimization techniques including shared parallel computing units, hybrid-precision computing for a hybrid update mechanism, and the globally asynchronous and locally synchronous (GALS) strategy, we achieve a maximum network size of 150x10 and a peak working frequency of 100 MHz for the BCPNN-based associative memory system on the Xilinx Alveo U200 Card. The tradeoff between performance and hardware overhead of the design is explored and evaluated. Compared with the GPU counterpart, the FPGA-based implementation demonstrates significant improvements in both performance and energy efficiency, achieving a maximum latency reduction of 33.25x, and a power reduction of over 6.9x, all while maintaining the same network configuration.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Series
Asia and South Pacific Design Automation Conference Proceedings, ISSN 2153-6961
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:kth:diva-346310 (URN), 10.1109/ASP-DAC58780.2024.10473880 (DOI), 001196002900009 (), 2-s2.0-85189308319 (Scopus ID)
Conference
29th Asia and South Pacific Design Automation Conference (ASP-DAC), JAN 22-25, 2024, BrainKorea Four 21, Incheon, SOUTH KOREA
Note

QC 20240513

Part of ISBN 979-8-3503-9354-5

Available from: 2024-05-13. Created: 2024-05-13. Last updated: 2024-07-23. Bibliographically approved
Xu, J., Zheng, Y., Li, F., Stathis, D., Shen, R., Chu, H., . . . Hemani, A. (2024). Modeling Cycle-to-Cycle Variation in Memristors for In-Situ Unsupervised Trace-STDP Learning. IEEE Transactions on Circuits and Systems - II - Express Briefs, 71(2), 627-631
2024 (English). In: IEEE Transactions on Circuits and Systems - II - Express Briefs, ISSN 1549-7747, E-ISSN 1558-3791, Vol. 71, no 2, p. 627-631. Article in journal (Refereed) Published
Abstract [en]

Evaluating the computational accuracy of Spiking Neural Networks (SNNs) implemented with in-situ learning on large-scale memristor crossbars remains a challenge due to the lack of a versatile model for the variations in non-ideal memristors. This brief proposes a novel behavioral variation model along with a four-stage pipeline for physical memristors. The proposed variation model combines both absolute and relative variations. Therefore, it can better characterize different memristor cycle-to-cycle (C2C) variations in practice. The proposed variation model has been used to simulate the behavior of two physical memristors. Adopting the non-ideal memristor model, the trace-based spike-timing-dependent plasticity (STDP) unsupervised in-memristor learning system is simulated. Although the synaptic-level weight simulation shows a performance degradation of 7.99% and a 4.07% increase in the relative root mean square error (RRMSE), the network-level simulation results show no accuracy loss on the MNIST benchmark. Furthermore, the impacts of absolute and relative C2C variations on network performance are simulated and analyzed through two sets of univariate experiments.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Memristors, Correlation, Integrated circuit modeling, Behavioral sciences, Mathematical models, Computational modeling, Task analysis, Memristor, non-ideality, variation model, trace-based STDP, in-situ unsupervised learning
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:kth:diva-345567 (URN), 10.1109/TCSII.2023.3309329 (DOI), 001167527900030 (), 2-s2.0-85169689021 (Scopus ID)
Note

QC 20240412

Available from: 2024-04-12. Created: 2024-04-12. Last updated: 2024-04-12. Bibliographically approved
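The abstract above says the variation model combines absolute and relative cycle-to-cycle variations but does not give its equations. The sketch below shows one plausible reading, purely as an assumption: a relative term whose spread scales with the current conductance and an absolute term with a fixed spread, both Gaussian, with the result clipped to the device range. The function name, the normalized range [0, 1], and the Gaussian form are hypothetical and may differ from the published model.

```python
import random


def apply_c2c_variation(g, sigma_rel, sigma_abs, rng, g_min=0.0, g_max=1.0):
    """Perturb a normalized conductance g after one write cycle.

    relative part: g * N(0, sigma_rel)   (spread grows with g)
    absolute part: N(0, sigma_abs)       (spread independent of g)
    """
    noisy = g + g * rng.gauss(0.0, sigma_rel) + rng.gauss(0.0, sigma_abs)
    return min(max(noisy, g_min), g_max)   # keep within the device range


rng = random.Random(42)
samples = [apply_c2c_variation(0.5, 0.05, 0.01, rng) for _ in range(1000)]
```

Setting one of the two sigmas to zero while sweeping the other is a simple way to mimic the kind of univariate experiment the abstract mentions.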
Wang, D., Xu, J., Li, F., Zhang, L., Cao, C., Stathis, D., . . . Zou, Z. (2023). A Memristor-Based Learning Engine for Synaptic Trace-Based Online Learning. IEEE Transactions on Biomedical Circuits and Systems, 17(5), 1153-1165
2023 (English). In: IEEE Transactions on Biomedical Circuits and Systems, ISSN 1932-4545, E-ISSN 1940-9990, Vol. 17, no 5, p. 1153-1165. Article in journal (Refereed) Published
Abstract [en]

The memristor has been extensively used to facilitate the synaptic online learning of brain-inspired spiking neural networks (SNNs). However, current memristor-based work cannot support the widely used yet sophisticated trace-based learning rules, including the trace-based Spike-Timing-Dependent Plasticity (STDP) and the Bayesian Confidence Propagation Neural Network (BCPNN) learning rules. This paper proposes a learning engine to implement trace-based online learning, consisting of memristor-based blocks and analog computing blocks. The memristor is used to mimic the synaptic trace dynamics by exploiting the nonlinear physical property of the device. The analog computing blocks are used for the addition, multiplication, logarithmic and integral operations. By organizing these building blocks, a reconfigurable learning engine is architected and realized to simulate the STDP and BCPNN online learning rules, using memristors and 180 nm analog CMOS technology. The results show that the proposed learning engine can achieve energy consumption of 10.61 pJ and 51.49 pJ per synaptic update for the STDP and BCPNN learning rules, respectively, with a 147.03× and 93.61× reduction compared to the 180 nm ASIC counterparts, and also a 9.39× and 5.63× reduction compared to the 40 nm ASIC counterparts. Compared with the state-of-the-art work of Loihi and eBrainII, the learning engine can reduce the energy per synaptic update by 11.31× and 13.13× for trace-based STDP and BCPNN learning rules, respectively.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Keywords
Bayesian confidence propagation neural network (BCPNN), learning engine, memristor, online learning, spike-timing-dependent plasticity (STDP), spiking neural network (SNN), trace dynamics
National Category
Biomedical Laboratory Science/Technology
Identifiers
urn:nbn:se:kth:diva-349842 (URN), 10.1109/TBCAS.2023.3291021 (DOI), 001122543600001 (), 37390002 (PubMedID), 2-s2.0-85163535883 (Scopus ID)
Note

QC 20240703

Available from: 2024-07-03. Created: 2024-07-03. Last updated: 2024-07-03. Bibliographically approved
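For readers unfamiliar with trace-based STDP, the following sketch shows the standard pair-based rule in discrete time: each neuron keeps an exponentially decaying trace of its recent spikes, and the weight changes in proportion to the other side's trace when a spike occurs. This is a generic software formulation and not the memristor-plus-analog circuit of the paper. The function name and all parameter values (time constant, learning rates, initial weight, weight bounds) are hypothetical.

```python
import math


def run_trace_stdp(pre_spikes, post_spikes, dt=1.0, tau=20.0,
                   a_plus=0.01, a_minus=0.012, w=0.5):
    """Simulate one synapse; spike trains are equal-length sequences of 0/1."""
    decay = math.exp(-dt / tau)       # per-step exponential trace decay
    x_pre = y_post = 0.0              # pre- and postsynaptic traces
    for pre, post in zip(pre_spikes, post_spikes):
        x_pre *= decay
        y_post *= decay
        if pre:
            x_pre += 1.0
            w -= a_minus * y_post     # pre after post: depression
        if post:
            y_post += 1.0
            w += a_plus * x_pre       # post after pre: potentiation
        w = min(max(w, 0.0), 1.0)     # keep the weight in [0, 1]
    return w


w_ltp = run_trace_stdp([1, 0, 0, 0], [0, 0, 1, 0])  # pre leads post
w_ltd = run_trace_stdp([0, 0, 1, 0], [1, 0, 0, 0])  # post leads pre
```

With these settings, a presynaptic spike that precedes a postsynaptic spike raises the weight above its initial value, and the reverse order lowers it. The decaying traces are the part that the paper's engine maps onto memristor dynamics.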
Stathis, D., Chaourani, P., Syed, J. & Hemani, A. (2023). Clock tree generation by abutment in synchoros VLSI design. Microprocessors and microsystems, 102, Article ID 104913.
2023 (English). In: Microprocessors and microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 102, article id 104913. Article in journal (Refereed) Published
Abstract [en]

The synchoros VLSI design style has been proposed as an alternative to standard cell-based design. Standard cells are replaced by synchoros, large-grain VLSI design objects called SiLago (Silicon Lego) blocks. This new design style eliminates the need to synthesise ad hoc wires of any type, whether functional or infrastructural. SiLago blocks are organised into region instances. In a region instance, communication amongst SiLago blocks is synchronous and happens over a regional network on chip (NoC), whose fragments are also absorbed into SiLago blocks. Consequently, the regional NoCs get created by the abutment of SiLago blocks. The clock tree used in a region is called a regional clock tree (RCT). The synchoros VLSI design style requires that the RCT, like the regional NoCs, is also created by abutting its fragments. The RCT fragments are absorbed within the SiLago blocks. The RCT created by the abutment is not an ad hoc clock tree but a structured and predictable design with known cost metrics. The design of such an RCT is the focus of this paper. The scheme is scalable, and we demonstrate that the proposed RCT can be generated for valid VLSI designs of ∼1.5 million gates. The RCT created by abutment is correct by construction, and its properties are predictable. Additionally, we present an in-depth description of the method used to find the optimal configuration for the proposed design. We have checked the generated RCTs with static timing analysis to confirm the correct-by-construction claim. Finally, we show that the cost metrics of the SiLago RCT are comparable to those of an RCT generated by commercial EDA tools.

Place, publisher, year, edition, pages
Elsevier BV, 2023
Keywords
CTS, EDA, SiLago, VLSI design, Synchoricity
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering; Embedded Systems
Identifiers
urn:nbn:se:kth:diva-338395 (URN), 10.1016/j.micpro.2023.104913 (DOI), 001100236200001 (), 2-s2.0-85172294458 (Scopus ID)
Note

Not duplicate with DiVA 1655367

QC 20231024

Available from: 2023-10-24. Created: 2023-10-24. Last updated: 2023-12-11. Bibliographically approved
Kallapu, R., Stathis, D., Hoppe, S. & Hemani, A. (2023). DRRA-based Reconfigurable Architecture for Mixed-Radix FFT. In: Proceedings of the IEEE International Conference on VLSI Design. Paper presented at 36th International Conference on VLSI Design, VLSID 2023, Hyderabad, India (pp. 25-30). Institute of Electrical and Electronics Engineers (IEEE)
2023 (English). In: Proceedings of the IEEE International Conference on VLSI Design, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 25-30. Conference paper, Published paper (Refereed)
Abstract [en]

The Fast Fourier Transform (FFT) is an important algorithm used in digital signal processing and communication applications. Furthermore, mixed-radix FFT provides flexibility and increases the speed of FFT computation. For real-time processing, an efficient hardware implementation using reconfigurable architectures, which can offer higher performance and flexibility, is preferred. In this paper, we propose an architecture for the implementation of the FFT that is derived from the Dynamically Reconfigurable Resource Array and has multiple parallel processing cells while also providing the flexibility to select the radix for each stage of the FFT. The twiddle factor generator proposed in this architecture minimizes the memory requirements and simplifies the hardware. Using the proposed architecture, FFTs of various lengths were mapped onto either a single cell or multiple cells in parallel. It is observed that the proposed architecture improves the performance by 2x compared to the existing FFT architectures.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
International Conference on VLSI Design, ISSN 1063-9667
Keywords
Fast Fourier Transform, FFT butterfly radix 2 & 4, Coarse Grain Reconfigurable Architectures, Field Programmable Gate Array
National Category
Embedded Systems
Identifiers
urn:nbn:se:kth:diva-329456 (URN), 10.1109/VLSID57277.2023.00020 (DOI), 000987761500005 (), 2-s2.0-85153861348 (Scopus ID)
Conference
36th International Conference on VLSI Design, VLSID 2023, Hyderabad, India
Note

QC 20230621

Available from: 2023-06-21. Created: 2023-06-21. Last updated: 2023-06-26. Bibliographically approved
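To illustrate what selecting a radix per FFT stage means, the sketch below gives a recursive mixed-radix decimation-in-time FFT in software. At each level it picks the first radix from a preference list that divides the current length and falls back to a direct DFT when none does. It is a reference formulation of the Cooley-Tukey decomposition only, not the DRRA mapping or the twiddle factor generator of the paper; the function name and the default radix preference `(4, 2)` are hypothetical.

```python
import cmath


def mixed_radix_fft(x, radices=(4, 2)):
    """Return the DFT of sequence x using a per-stage choice of radix."""
    n = len(x)
    if n == 1:
        return list(x)
    # Choose this stage's radix; r == n means a direct DFT of the whole block.
    r = next((p for p in radices if n % p == 0), n)
    m = n // r
    # Decimation in time: r interleaved sub-sequences, each of length m.
    subs = [mixed_radix_fft(x[q::r], radices) for q in range(r)]
    out = [0j] * n
    for k in range(n):
        # Combine sub-transforms with twiddle factors W_n^(q*k).
        out[k] = sum(
            subs[q][k % m] * cmath.exp(-2j * cmath.pi * q * k / n)
            for q in range(r)
        )
    return out
```

For a length-8 input with the default preference, the first stage uses radix 4 and the second uses radix 2; a length-12 input uses radix 4 followed by a direct 3-point DFT.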
Identifiers
ORCID iD: orcid.org/0000-0002-5697-4272