Cross-modality sub-image retrieval using contrastive multimodal image representations
Breznik, Eva: KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems, Medical Imaging; Department of Information Technology, Uppsala University, 751 05 Uppsala, Sweden; Department of Biomedical Engineering and Health Systems, Royal Institute of Technology, 141 52 Stockholm, Sweden. ORCID iD: 0000-0003-3147-5626
Wetzer, Elisabeth: Department of Information Technology, Uppsala University, 751 05 Uppsala, Sweden; Department of Physics and Technology, UiT The Arctic University of Norway, 9037 Tromsø, Norway.
Lindblad, Joakim: Department of Information Technology, Uppsala University, 751 05 Uppsala, Sweden.
Sladoje, Nataša: Department of Information Technology, Uppsala University, 751 05 Uppsala, Sweden.
2024 (English). In: Scientific Reports, E-ISSN 2045-2322, Vol. 14, no. 1, article id 18798. Article in journal (Refereed). Published.
Abstract [en]

In tissue characterization and cancer diagnostics, multimodal imaging has emerged as a powerful technique. Thanks to computational advances, large datasets can be exploited to discover patterns in pathologies and improve diagnosis. However, this requires efficient and scalable image retrieval methods. Cross-modality image retrieval is particularly challenging, since images of similar (or even the same) content captured by different modalities might share few common structures. We propose a new application-independent content-based image retrieval (CBIR) system for reverse (sub-)image search across modalities, which combines deep learning to generate representations (embedding the different modalities in a common space) with robust feature extraction and bag-of-words models for efficient and reliable retrieval. We illustrate its advantages through a replacement study, exploring a number of feature extractors and learned representations, as well as through comparison to recent (cross-modality) CBIR methods. For the task of (sub-)image retrieval on a (publicly available) dataset of brightfield and second harmonic generation microscopy images, the results show that our approach is superior to all tested alternatives. We discuss the shortcomings of the compared methods and observe the importance of equivariance and invariance properties of the learned representations and feature extractors in the CBIR pipeline. Code is available at: https://github.com/MIDA-group/CrossModal_ImgRetrieval.
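For orientation, the kind of pipeline the abstract outlines could look roughly as sketched below: two encoders are trained with a contrastive (InfoNCE-style) objective on co-registered patch pairs so that both modalities map into a common representation space; local features are then extracted from these representations, quantized into a visual vocabulary, and compared as bag-of-visual-words histograms. This is a minimal Python sketch under assumptions made here, not the authors' implementation (that is in the repository linked above): the encoder architecture, the SIFT and MiniBatchKMeans choices, and names such as paired_patch_loader and db_descriptors are hypothetical.

import cv2
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import MiniBatchKMeans

class Encoder(nn.Module):
    # Small CNN mapping a 1-channel image to a dense 1-channel representation
    # (hypothetical architecture; one encoder per modality).
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def info_nce(za, zb, tau=0.1):
    # Contrastive loss over a batch of co-registered patch pairs:
    # representations of the same scene in the two modalities are pulled
    # together, all other pairings in the batch are pushed apart.
    za = nn.functional.normalize(za.flatten(1), dim=1)
    zb = nn.functional.normalize(zb.flatten(1), dim=1)
    logits = za @ zb.t() / tau
    return nn.functional.cross_entropy(logits, torch.arange(za.size(0)))

enc_a, enc_b = Encoder(), Encoder()
opt = torch.optim.Adam(list(enc_a.parameters()) + list(enc_b.parameters()), lr=1e-4)
# for xa, xb in paired_patch_loader:   # hypothetical loader of aligned pairs
#     loss = info_nce(enc_a(xa), enc_b(xb))
#     opt.zero_grad(); loss.backward(); opt.step()

# Retrieval stage: local features on the common representations, quantized
# into one bag-of-visual-words histogram per (sub-)image.
sift = cv2.SIFT_create()

def to_uint8(rep):
    # Rescale a representation to an 8-bit image for the feature extractor.
    rep = rep - rep.min()
    return (255.0 * rep / max(float(rep.max()), 1e-8)).astype(np.uint8)

def descriptors(img_u8):
    _, des = sift.detectAndCompute(img_u8, None)
    return des if des is not None else np.zeros((0, 128), np.float32)

def bow_histogram(des, kmeans):
    words = kmeans.predict(des.astype(np.float32))
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(np.float32)
    return hist / max(float(np.linalg.norm(hist)), 1e-8)

# Vocabulary is fit once on database descriptors (db_descriptors is hypothetical):
# kmeans = MiniBatchKMeans(n_clusters=512).fit(np.vstack(db_descriptors))

Retrieval then reduces to nearest-neighbour search over the normalized histograms (a dot product of L2-normalized vectors is cosine similarity). One standard rationale for why bag-of-words supports sub-image queries is that a query crop's local features are roughly a subset of those of the containing database image, so the two histograms still overlap strongly.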

Place, publisher, year, edition, pages
Springer Nature, 2024. Vol. 14, no. 1, article id 18798.
National Category
Medical Imaging; Computer graphics and computer vision; Signal Processing
Identifiers
URN: urn:nbn:se:kth:diva-352358
DOI: 10.1038/s41598-024-68800-1
ISI: 001318393400020
PubMedID: 39138271
Scopus ID: 2-s2.0-85201250094
OAI: oai:DiVA.org:kth-352358
DiVA id: diva2:1893066
Note

QC 20241024

Available from: 2024-08-28. Created: 2024-08-28. Last updated: 2025-02-09. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text | PubMed | Scopus
