Learning from Offline Foundation Features with Tensor Augmentations
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0001-9437-4553
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0003-1401-3497
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0001-6204-0778
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab.
2024 (English). In: Advances in Neural Information Processing Systems 37 (NeurIPS 2024) / [ed] A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak and C. Zhang, Curran Associates, 2024. Conference paper, Published paper (Refereed)
Abstract [en]

We introduce Learning from Offline Foundation Features with Tensor Augmentations (LOFF-TA), an efficient training scheme designed to harness the capabilities of foundation models in limited resource settings where their direct development is not feasible. LOFF-TA involves training a compact classifier on cached feature embeddings from a frozen foundation model, resulting in up to 37× faster training and up to 26× reduced GPU memory usage. Because the embeddings of augmented images would be too numerous to store, yet the augmentation process is essential for training, we propose to apply tensor augmentations to the cached embeddings of the original non-augmented images. LOFF-TA makes it possible to leverage the power of foundation models, regardless of their size, in settings with limited computational capacity. Moreover, LOFF-TA can be used to apply foundation models to high-resolution images without increasing compute. In certain scenarios, we find that training with LOFF-TA yields better results than directly fine-tuning the foundation model.
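
The training scheme described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `frozen_encoder` function is a hypothetical stand-in (a fixed random patch projection) for a real foundation model, the two tensor augmentations (horizontal flip of the feature map and additive Gaussian noise) are example choices, and the compact classifier is a plain softmax layer on mean-pooled features. The point is only the structure: features are computed once from the original images, cached, and then augmented directly as tensors at every epoch.

```python
import numpy as np

PATCH = 4  # hypothetical patch size of the stand-in encoder


def frozen_encoder(images, proj):
    """Hypothetical stand-in for a frozen foundation model.

    Splits each image into PATCH x PATCH patches and projects every patch
    with a fixed matrix, giving a feature map of shape (n, H, W, D).
    """
    n, h, w = images.shape
    gh, gw = h // PATCH, w // PATCH
    patches = images.reshape(n, gh, PATCH, gw, PATCH).transpose(0, 1, 3, 2, 4)
    return patches.reshape(n, gh, gw, PATCH * PATCH) @ proj


def tensor_augment(feats, rng, noise_std=0.1):
    """Augment cached feature maps directly (assumed example augmentations):
    a random horizontal flip per sample, then additive Gaussian noise."""
    out = feats.copy()
    flip = rng.random(len(out)) < 0.5
    out[flip] = out[flip, :, ::-1, :]  # reverse the width axis of the feature map
    return out + rng.normal(0.0, noise_std, size=out.shape)


def train_classifier(cache, labels, n_classes, rng, epochs=50, lr=0.1):
    """Train a softmax layer on mean-pooled, freshly augmented cached features."""
    weights = np.zeros((cache.shape[-1], n_classes))
    for _ in range(epochs):
        # The encoder is never called here: only the cache is augmented.
        x = tensor_augment(cache, rng).mean(axis=(1, 2))
        logits = x @ weights
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        probs[np.arange(len(labels)), labels] -= 1.0  # gradient of cross-entropy
        weights -= lr * x.T @ probs / len(labels)
    return weights


rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=64)
# Toy data: class 0 images are dark, class 1 images are bright, plus pixel noise.
images = (2.0 * labels - 1.0)[:, None, None] + rng.normal(0.0, 0.5, size=(64, 16, 16))
proj = rng.normal(0.0, 0.25, size=(PATCH * PATCH, 8))

cache = frozen_encoder(images, proj)  # one forward pass per original image, then stored
weights = train_classifier(cache, labels, n_classes=2, rng=rng)
accuracy = float(((cache.mean(axis=(1, 2)) @ weights).argmax(axis=1) == labels).mean())
print(cache.shape, accuracy)
```

In this sketch the cache holds one feature map per original image, so storage does not grow with the number of augmented views, and the cost of the encoder is paid once rather than at every training step.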

Place, publisher, year, edition, pages
Curran Associates, 2024.
Keywords [en]
Adaptation, Transfer learning, Foundation models, Augmentation
National Category
Computer graphics and computer vision
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-354832
Scopus ID: 2-s2.0-105000782383
OAI: oai:DiVA.org:kth-354832
DiVA, id: diva2:1905519
Conference
NeurIPS 2024, the Thirty-Eighth Annual Conference on Neural Information Processing Systems, Vancouver, December 10-15, 2024
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20250408

Available from: 2024-10-14 Created: 2024-10-14 Last updated: 2025-04-08 Bibliographically approved
In thesis
1. Robust and generalizable AI for medical image processing
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Artificial intelligence (AI) offers significant potential to enhance the accuracy and efficiency of medical diagnosis, monitoring, and treatment. In ovarian cancer, where 70% of cases are detected only at stage III or IV, AI-driven tools could enable earlier detection and improve patient outcomes. However, the safety-critical nature of medicine, where even minor errors can have serious consequences, has led to cautious adoption of AI technologies. To be integrated into clinical practice, AI must demonstrate not only good performance but also robustness and generalizability across diverse clinical settings.

This thesis investigates the development and evaluation of generalizable and robust AI systems, with a focus on medical image analysis. We begin by addressing key gaps in our understanding of how complexity influences generalization, exploring scaling laws across increasingly complex tasks and analyzing how the performance of foundation models is impacted. Foundation models are becoming vital for AI development in medical imaging, particularly in addressing data scarcity challenges. Adapting these models for medical applications often demands substantial computational resources because of their large size. To mitigate these computational demands, we propose an efficient method for adapting the robust representations of large foundation models trained on diverse datasets to specific medical tasks, aiming to make foundation models more accessible for medical use without compromising their effectiveness.

Using ovarian cancer as a case study, we develop and rigorously evaluate AI systems for ovarian tumor classification. Our systems demonstrate superior performance compared to both non-expert and expert doctors, with a strong emphasis on ensuring accuracy, generalizability across hospitals, and robustness across diverse patient subgroups. We implement a comprehensive evaluation strategy that tests the AI systems in varied clinical settings, ensuring that they maintain high performance.

Finally, we explore the integration of AI systems into clinical workflows, with a focus on the development of joint human-AI systems. By designing AI systems that collaborate effectively with healthcare professionals, we aim to enhance diagnostic accuracy, reduce doctors' workloads, and optimize the use of healthcare resources. Our collaborative human-AI system is designed to be generalizable across different clinical settings to improve patient care and advance the broader adoption of AI in medical practice, paving the way for more efficient and effective healthcare solutions.

Abstract [sv]

Artificiell intelligens (AI) har stor potential att förbättra träffsäkerheten och effektiviteten inom medicinsk diagnostik, uppföljning och behandling. Inom äggstockscancer, där 70% av fallen upptäcks först i stadium III eller IV, kan AI-drivna verktyg möjliggöra tidigare upptäckt och förbättra patienternas utfall. Men den säkerhetskritiska karaktären av medicin—där även mindre fel kan få allvarliga konsekvenser—har lett till en försiktig användning av AI-teknologier. För att AI ska kunna integreras i klinisk praxis är god prestanda ej tillräckligt, utan den måste även uppvisa robusthet och generaliserbarhet till olika kliniska miljöer.

Denna avhandling undersöker utvecklingen och utvärderingen av generaliserbara och robusta AI-system, med fokus på medicinsk bildanalys. Vi börjar med att adressera viktiga luckor i vår förståelse av hur komplexitet påverkar generalisering, genom att utforska skalningslagar över allt mer komplexa uppgifter och analysera hur prestandan hos grundmodeller påverkas. Grundmodeller blir allt viktigare för AI-utveckling inom medicinsk bildbehandling, särskilt vad det gäller att hantera utmaningar med otillräckliga datamängder. Att anpassa dessa modeller för medicinska tillämpningar kräver ofta betydande beräkningsresurser, särskilt på grund av deras stora storlek. För att minska dessa krav föreslår vi en effektiv metod för att anpassa de robusta representationerna av stora grundmodeller som tränats på mångsidiga dataset till specifika medicinska uppgifter, med målet att göra grundmodeller mer tillgängliga för medicinsk användning utan att kompromissa med deras effektivitet.

Med äggstockscancer som ett fallstudie utvecklar och utvärderar vi AI-system för klassificering av äggstockstumörer. Våra system visar överlägsen prestanda jämfört med både icke-experter och expertläkare, och visar på god träffsäkerhet, generaliserbarhet över sjukhus och robusthet över olika patientgrupper. Vi implementerar en omfattande utvärderingsstrategi som testar AI-systemen i olika kliniska miljöer och säkerställer att de bibehåller hög prestanda.

Slutligen undersöker vi integrationen av AI-system i kliniska arbetsflöden, med fokus på utvecklingen av system där människa och AI samverkar. Genom att designa AI-system som samarbetar effektivt med vårdpersonal strävar vi efter att förbättra diagnostisk träffsäkerhet, minska läkarnas arbetsbelastning samt optimera användningen av vårdresurser. Vårt samverkanssystem mellan människa och AI är utformat för att vara generaliserbart över olika kliniska miljöer för att förbättra patientvården och främja en bredare användning av AI inom medicinsk praxis, vilket banar väg för mer effektiva och ändamålsenliga vårdlösningar.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. p. vii, 88
Series
TRITA-EECS-AVL ; 2024:81
Keywords
Medical imaging, Generalization, Robustness, Uncertainty, Medicinsk avbildning, Generalisering, Robusthet, Osäkerhet
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-354838 (URN)
978-91-8106-078-2 (ISBN)
Public defence
2024-11-08, Kollegiesalen, Brinellvägen 6, 114 28, Stockholm, 13:00 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20241017

Available from: 2024-10-17 Created: 2024-10-15 Last updated: 2024-10-21 Bibliographically approved

Authority records

Konuk, Emir; Matsoukas, Christos; Sorkhei, Moein; Smith, Kevin

Search in DiVA

By author/editor
Konuk, Emir; Matsoukas, Christos; Sorkhei, Moein; Lertsiravarameth, Phitchapha; Smith, Kevin
By organisation
Computational Science and Technology (CST); Science for Life Laboratory, SciLifeLab
Computer graphics and computer vision
