Are Natural Domain Foundation Models Useful for Medical Image Classification?
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0009-0008-4117-1638
KTH, Centres, Science for Life Laboratory, SciLifeLab. KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0003-2920-8510
AstraZeneca, Gothenburg, Sweden.
2024 (English). In: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 7619-7628. Conference paper, Published paper (Refereed)
Abstract [en]

The deep learning field is converging towards the use of general foundation models that can be easily adapted for diverse tasks. While this paradigm shift has become common practice within the field of natural language processing, progress has been slower in computer vision. In this paper, we attempt to address this issue by investigating the transferability of various state-of-the-art foundation models to medical image classification tasks. Specifically, we evaluate the performance of five foundation models, namely SAM, SEEM, DINOv2, BLIP, and OpenCLIP, across four well-established medical imaging datasets. We explore different training settings to fully harness the potential of these models. Our study shows mixed results. DINOv2 consistently outperforms the standard practice of ImageNet pretraining. However, other foundation models failed to consistently beat this established baseline, indicating limitations in their transferability to medical image classification tasks.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024. p. 7619-7628
Keywords [en]
Algorithms: Machine learning architectures, formulations, and algorithms; Algorithms: Datasets and evaluations; Applications: Biomedical / healthcare / medicine
National Category
Computer Sciences; Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:kth:diva-350585
DOI: 10.1109/WACV57701.2024.00746
ISI: 001222964607075
Scopus ID: 2-s2.0-85184972028
OAI: oai:DiVA.org:kth-350585
DiVA, id: diva2:1884793
Conference
2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, Waikoloa, United States of America, Jan 4 2024 - Jan 8 2024
Note

Part of ISBN 9798350318920

QC 20240718

Available from: 2024-07-18 Created: 2024-07-18 Last updated: 2025-12-08 Bibliographically approved

Authority records

Huix, Joana Palés; Ganeshan, Adithya Raju; Fredin Haslum, Johan; Matsoukas, Christos; Smith, Kevin
