KTH Publications (kth.se)
Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent Systems, Robotics, Perception and Learning (RPL); Sapienza University of Rome, Department of Computer Science, I-00198 Rome, Italy. ORCID iD: 0000-0003-2939-3007
Sapienza University of Rome, Department of Computer Science, I-00198 Rome, Italy.
Sapienza University of Rome, Department of Computer Science, I-00198 Rome, Italy.
University of Calabria, Department of Computer Engineering, Modeling, Electronics and Systems, I-87030 Arcavacata di Rende, Italy.
2025 (English). In: Computer Vision - ECCV 2024 Workshops, Part XX / [ed] Canton, C.; Pont-Tuset, J.; Del Bue, A.; Tommasi, T., Springer Nature, 2025, Vol. 15642, pp. 53-70. Conference paper, published paper (refereed).
Abstract [en]

This paper introduces the Neural Transcoding Vision Transformer (NT-ViT), a generative model designed to estimate high-resolution functional Magnetic Resonance Imaging (fMRI) samples from simultaneous Electroencephalography (EEG) data. A key feature of NT-ViT is its Domain Matching (DM) sub-module, which effectively aligns the latent EEG representations with those of fMRI volumes, enhancing the model's accuracy and reliability. Unlike previous methods that tend to struggle with fidelity and reproducibility of images, NT-ViT addresses these challenges by ensuring methodological integrity and higher-quality reconstructions, which we showcase through extensive evaluation on two benchmark datasets; NT-ViT outperforms the current state-of-the-art by a significant margin in both cases, e.g., achieving a 10x reduction in RMSE and a 3.14x increase in SSIM on the Oddball dataset. An ablation study also provides insights into the contribution of each component to the model's overall effectiveness. This development is critical in offering a new approach to lessen the time and financial constraints typically linked with high-resolution brain imaging, thereby aiding in the swift and precise diagnosis of neurological disorders. While NT-ViT is not a replacement for actual fMRI but rather a step towards making such imaging more accessible, we believe it represents a pivotal advancement in clinical practice and neuroscience research. Code is available at https://github.com/rom42pla/ntvit.
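The abstract reports RMSE and SSIM as the volume-level metrics on which NT-ViT is compared against prior work. As a minimal sketch of how such metrics can be computed on fMRI volumes (plain NumPy; the single-window SSIM below is a simplification of the locally windowed SSIM usually reported, and the function names are illustrative, not taken from the paper's code):

```python
import numpy as np

def rmse(pred: np.ndarray, target: np.ndarray) -> float:
    # Root-mean-square error over all voxels of the two volumes.
    return float(np.sqrt(np.mean((pred - target) ** 2)))

def global_ssim(pred: np.ndarray, target: np.ndarray,
                data_range: float = 1.0) -> float:
    # Simplified SSIM computed over the whole volume as a single window,
    # using the standard stabilizing constants c1 and c2.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = pred.mean(), target.mean()
    var_x, var_y = pred.var(), target.var()
    cov = ((pred - mu_x) * (target - mu_y)).mean()
    return float(((2 * mu_x * mu_y + c1) * (2 * cov + c2)) /
                 ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)))
```

On identical volumes the sketch returns RMSE 0 and SSIM 1, a quick sanity check before comparing generated volumes against ground truth.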

Place, publisher, year, edition, pages
Springer Nature, 2025. Vol. 15642, pp. 53-70.
Series
Lecture Notes in Computer Science, ISSN 0302-9743
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-374157
DOI: 10.1007/978-3-031-91907-7_4
ISI: 001544994800004
Scopus ID: 2-s2.0-105014461109
OAI: oai:DiVA.org:kth-374157
DiVA id: diva2:2022922
Conference
18th European Conference on Computer Vision (ECCV), September 29 - October 4, 2024, Milan, Italy
Note

Part of ISBN 978-3-031-91906-0; 978-3-031-91907-7

QC 20251218

Available from: 2025-12-18. Created: 2025-12-18. Last updated: 2025-12-18. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text; Scopus

Authority records

Lanzino, Romeo; Maki, Atsuto
