Specific Instructions Set for Neural Network Acceleration Based on Multiple RISC-V Cores
KTH, School of Electrical Engineering and Computer Science (EECS).
2024 (English). Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Alternative title
Specifika instruktioner för neural nätverksacceleration baserat på flera RISC-V-kärnor (Swedish)
Abstract [en]

Artificial Intelligence (AI) has been applied to next-generation channel estimation and multiple-input multiple-output (MIMO) design in cellular communication modems. The requirements of high speed, small footprint, and low power consumption make inference of even medium-sized neural networks challenging. General-purpose accelerators do not achieve optimal performance, and existing instruction set architectures (ISAs) lack direct support for these specific applications. In this thesis, we propose a multi-core architecture that integrates a RISC-V core with a custom coprocessor to accelerate selected neural networks, and we define its ISA for memory access, computation, and configuration. The design includes three types of tightly coupled memories (TCMs) positioned near the processing engine to realize cross-layer computation and to overlap memory access cycles, addressing the memory-access speed limitations of neural network computations. The dataflow, similar to the row-stationary method, optimizes performance by maximizing local data reuse, thereby reducing costly data movement to main memory. The coprocessor communicates with the main processor via the Rocket Custom Coprocessor (RoCC) interface. For evaluation, a pre-trained neural network is run in the Spike instruction set simulator (ISS). Our design speeds up execution by 36% for graph neural networks through efficient local data transfers between TCMs, achieving significantly higher performance than a general RISC-V core when processing two-dimensional matrix convolution and multiplication.
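
To make the RoCC coupling concrete, the sketch below shows how a custom coprocessor instruction could be issued from C code running on the RISC-V core. The custom-0 major opcode (0x0b) and the xd/xs1/xs2 convention come from the RoCC interface itself, but the funct7 value, the wrapper name, and the operand meanings are illustrative assumptions, not the encoding defined in the thesis.

    #include <stdint.h>

    /*
     * Hypothetical wrapper for one coprocessor instruction issued over the
     * RoCC interface. RoCC extensions sit under the RISC-V "custom" major
     * opcodes (custom-0 is 0x0b); funct3 = 0b111 marks that rd, rs1 and rs2
     * are all used. The funct7 value 0x01 and the operand meanings (source
     * and destination TCM addresses) are assumptions for illustration only.
     */
    static inline uint64_t nn_tile_op(uint64_t src_tcm_addr, uint64_t dst_tcm_addr)
    {
        uint64_t status;
        __asm__ volatile(
            ".insn r 0x0b, 0x7, 0x01, %0, %1, %2"
            : "=r"(status)
            : "r"(src_tcm_addr), "r"(dst_tcm_addr));
        return status;  /* e.g. a completion flag written back by the coprocessor */
    }

Such a wrapper can be compiled with a standard RISC-V GCC toolchain and exercised in a Spike-based setup like the one used for evaluation, provided the simulator is extended with a matching coprocessor model.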

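The row-stationary-like dataflow can likewise be pictured in software. The C sketch below keeps a small kernel resident in a local buffer (standing in for a weight TCM) and streams input rows through a sliding window (standing in for an input TCM), so each weight is reused across an entire output row before any data goes back to main memory. Buffer names, sizes, and data types are assumptions for illustration, not the accelerator's actual TCM configuration.

    #include <stdint.h>
    #include <string.h>

    #define K      3                    /* kernel height/width */
    #define IN_W   16                   /* input row width */
    #define OUT_W  (IN_W - K + 1)       /* output row width */

    /* Weights stay resident ("stationary") in a small local buffer while
     * input rows stream through a K-row window; each weight is reused for
     * every output element of the row before new data is fetched. */
    void conv2d_row_stationary(const int8_t *input, int in_h,
                               const int8_t kernel[K][K], int32_t *output)
    {
        int8_t w_buf[K][K];                      /* stand-in for a weight TCM */
        memcpy(w_buf, kernel, sizeof(w_buf));

        for (int r = 0; r + K <= in_h; r++) {
            const int8_t *rows[K];               /* stand-in for an input TCM window */
            for (int i = 0; i < K; i++)
                rows[i] = input + (size_t)(r + i) * IN_W;

            for (int c = 0; c < OUT_W; c++) {
                int32_t acc = 0;
                for (int i = 0; i < K; i++)
                    for (int j = 0; j < K; j++)
                        acc += rows[i][c + j] * w_buf[i][j];
                output[r * OUT_W + c] = acc;     /* stand-in for an output TCM write */
            }
        }
    }
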
Abstract [sv]

Artificiell intelligens (AI) har tillämpats i nästa generations kanalestimering och MIMO-detektering (multiple-input multiple-output) i modem för cellulär kommunikation. Kraven på hög hastighet, liten yta och låg strömförbrukning gör inferens av även medelstora neurala nätverk utmanande. Generella acceleratorer uppnår inte optimal prestanda, och befintliga instruktionsuppsättningsarkitekturer (ISA) saknar direkt stöd för dessa specifika applikationer. I detta arbete föreslår vi en flerkärnig arkitektur som integrerar en RISC-V-kärna med en anpassad co-processor för att accelerera utvalda neurala nätverk och definiera dess ISA. Designen inkluderar tre typer av tätt kopplade minnen (TCM) placerade nära kärnorna för att realisera beräkning över flera lager och överlappande minnesåtkomstcykler, för att adressera hastighetsbegränsningar för minnesåtkomst i beräkningar av neurala nätverk. Dataflödet, som liknar row stationary, optimerar prestandan genom att maximera återanvändning av lokal data, vilket minskar kostsamma dataöverföringar till huvudminnet. Co-processorn kommunicerar via RoCC-gränssnittet (Rocket Custom Coprocessor) med huvudprocessorn. Vår design accelererar exekveringscyklerna med 36% för graph neural networks (GNN) genom effektiva lokala dataöverföringar mellan olika TCM, och uppnår avsevärt högre prestanda än en allmän RISC-V-kärna vid bearbetning av tvådimensionell matrisfaltning och multiplikation.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. p. 52
Series
TRITA-EECS-EX ; 2024:499
Keywords [en]
Instruction Set Architecture, RISC-V, AI Coprocessor
Keywords [sv]
Instruktionsuppsättningsarkitektur, RISC-V, AI Coprocessor
National Category
Computer Sciences; Computer Engineering
Identifiers
URN: urn:nbn:se:kth:diva-352310
OAI: oai:DiVA.org:kth-352310
DiVA, id: diva2:1892789
External cooperation
Huawei Sweden R&D
Subject / course
Electrical Engineering
Available from: 2024-09-27. Created: 2024-08-27. Last updated: 2024-09-27. Bibliographically approved.

Open Access in DiVA

fulltext (2910 kB), 387 downloads
File information
File name: FULLTEXT01.pdf. File size: 2910 kB. Checksum: SHA-512
069dcb6b5018cfd22350e8222a0975e800dafe3890c6f2492f3e98968382d6b23fc97d018d8cd31ce539fa1cd30a77f06fe24ac71d4fb132ae1dbf4e34bf90ab
Type: fulltext. Mimetype: application/pdf

By organisation
School of Electrical Engineering and Computer Science (EECS)
Computer Sciences; Computer Engineering

Total: 387 downloads
The number of downloads is the sum of all downloads of full texts. It may include, for example, previous versions that are no longer available.
