International multicenter validation of AI-driven ultrasound detection of ovarian cancer
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0001-9437-4553
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Biomedical Engineering and Health Systems.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST).
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Ovarian lesions are common and often incidentally detected. A critical shortage of expert ultrasound examiners has raised concerns of unnecessary interventions and delayed cancer diagnoses. Deep learning has shown promising results in the detection of ovarian cancer in ultrasound images; however, external validation is lacking. In this international multicenter retrospective study, we developed and validated transformer-based neural network models using a comprehensive dataset of 17,119 ultrasound images from 3,652 patients across 20 centers in eight countries. Using a leave-one-center-out cross-validation scheme, for each center in turn, we trained a model using data from the remaining centers. The models demonstrated robust performance across centers, ultrasound systems, histological diagnoses and patient age groups, significantly outperforming both expert and non-expert examiners on all evaluated metrics, namely F1 score, sensitivity, specificity, accuracy, Cohen’s kappa, Matthews correlation coefficient, diagnostic odds ratio and Youden’s J statistic. Furthermore, in a retrospective triage simulation, artificial intelligence (AI)-driven diagnostic support reduced referrals to experts by 63% while significantly surpassing the diagnostic performance of the current practice. These results show that transformer-based models exhibit strong generalization and above human expert-level diagnostic accuracy, with the potential to alleviate the shortage of expert ultrasound examiners and improve patient outcomes.
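
The leave-one-center-out scheme and the eight metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the per-sample `centers` list, and the convention that label 1 means malignant are all assumptions made for illustration, and the metric formulas are the standard confusion-matrix definitions.

```python
from collections import defaultdict
from math import sqrt


def leave_one_center_out(centers):
    """Yield (held_out_center, train_indices, test_indices), one split per center.

    `centers` is a hypothetical list giving the center ID of each sample.
    """
    by_center = defaultdict(list)
    for idx, center in enumerate(centers):
        by_center[center].append(idx)
    for held_out in sorted(by_center):
        test_idx = by_center[held_out]
        # Train on every sample that does not come from the held-out center.
        train_idx = sorted(
            i for c, idxs in by_center.items() if c != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Compute the eight metrics from binary labels (assumed: 1 = malignant).

    Assumes every confusion-matrix cell is non-zero; otherwise some
    ratios (for example the diagnostic odds ratio) are undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Expected agreement by chance, used by Cohen's kappa.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "f1": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "cohen_kappa": (accuracy - p_e) / (1 - p_e),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
        "youden_j": sensitivity + specificity - 1,
    }
```

In a setup like the one described, each split would train one model on `train_idx` and evaluate it on `test_idx`, so every center serves once as an unseen external test site.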

Keywords [en]
Deep learning, Generalization, External validity, Ultrasound, Ovarian cancer
National Category
Computer graphics and computer vision
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-354833
OAI: oai:DiVA.org:kth-354833
DiVA, id: diva2:1905526
Note

QC 20241015

Accepted for publication

Available from: 2024-10-14 Created: 2024-10-14 Last updated: 2025-02-07 Bibliographically approved
In thesis
1. Robust and generalizable AI for medical image processing
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Artificial intelligence (AI) offers significant potential to enhance the accuracy and efficiency of medical diagnosis, monitoring, and treatment. In ovarian cancer, where 70% of cases are detected only at stage III or IV, AI-driven tools could enable earlier detection and improve patient outcomes. However, the safety-critical nature of medicine, where even minor errors can have serious consequences, has led to cautious adoption of AI technologies. To be integrated into clinical practice, AI must demonstrate not only good performance but also robustness and generalizability across diverse clinical settings.

This thesis investigates the development and evaluation of generalizable and robust AI systems, with a focus on medical image analysis. We begin by addressing key gaps in our understanding of how complexity influences generalization, exploring scaling laws across increasingly complex tasks and analyzing how the performance of foundation models is impacted. Foundation models are becoming vital for AI development in medical imaging, particularly in addressing data scarcity challenges. Adapting these models for medical applications often demands substantial computational resources, particularly due to their large size. To mitigate these computational demands, we propose an efficient method for adapting the robust representations of large foundation models trained on diverse datasets to specific medical tasks, aiming to make foundation models more accessible for medical use without compromising their effectiveness.

Using ovarian cancer as a case study, we develop and rigorously evaluate AI systems for ovarian tumor classification. Our systems demonstrate superior performance compared to both non-expert and expert doctors, with a strong emphasis on ensuring accuracy, generalizability across hospitals, and robustness across diverse patient subgroups. We implement a comprehensive evaluation strategy that tests the AI systems in varied clinical settings, ensuring that they maintain high performance.

Finally, we explore the integration of AI systems into clinical workflows, with a focus on the development of joint human-AI systems. By designing AI systems that collaborate effectively with healthcare professionals, we aim to enhance diagnostic accuracy, reduce doctors' workloads, and optimize the use of healthcare resources. Our collaborative human-AI system is designed to be generalizable across different clinical settings to improve patient care and advance the broader adoption of AI in medical practice, paving the way for more efficient and effective healthcare solutions.

Abstract [sv]

Artificial intelligence (AI) has great potential to improve the accuracy and efficiency of medical diagnosis, monitoring, and treatment. In ovarian cancer, where 70% of cases are detected only at stage III or IV, AI-driven tools could enable earlier detection and improve patient outcomes. However, the safety-critical nature of medicine, where even minor errors can have serious consequences, has led to cautious use of AI technologies. For AI to be integrated into clinical practice, good performance is not sufficient; it must also demonstrate robustness and generalizability across different clinical settings.

This thesis investigates the development and evaluation of generalizable and robust AI systems, with a focus on medical image analysis. We begin by addressing important gaps in our understanding of how complexity affects generalization, by exploring scaling laws across increasingly complex tasks and analyzing how the performance of foundation models is affected. Foundation models are becoming increasingly important for AI development in medical image processing, particularly for handling the challenges of insufficient data. Adapting these models to medical applications often requires substantial computational resources, particularly because of their large size. To reduce these requirements, we propose an efficient method for adapting the robust representations of large foundation models trained on diverse datasets to specific medical tasks, with the aim of making foundation models more accessible for medical use without compromising their effectiveness.

With ovarian cancer as a case study, we develop and evaluate AI systems for the classification of ovarian tumors. Our systems show superior performance compared with both non-expert and expert doctors, and demonstrate good accuracy, generalizability across hospitals, and robustness across different patient groups. We implement a comprehensive evaluation strategy that tests the AI systems in different clinical settings and ensures that they maintain high performance.

Finally, we investigate the integration of AI systems into clinical workflows, with a focus on developing systems in which humans and AI collaborate. By designing AI systems that collaborate effectively with healthcare staff, we strive to improve diagnostic accuracy, reduce doctors' workload, and optimize the use of healthcare resources. Our human-AI collaborative system is designed to be generalizable across different clinical settings in order to improve patient care and promote broader use of AI in medical practice, paving the way for more efficient and fit-for-purpose healthcare solutions.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. p. vii, 88
Series
TRITA-EECS-AVL ; 2024:81
Keywords
Medical imaging, Generalization, Robustness, Uncertainty
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-354838 (URN)
978-91-8106-078-2 (ISBN)
Public defence
2024-11-08, Kollegiesalen, Brinellvägen 6, 114 28, Stockholm, 13:00 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20241017

Available from: 2024-10-17 Created: 2024-10-15 Last updated: 2024-10-21 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Konuk, Emir; Raju, Adithya; Huix, Joana Palés; Herman, Pawel; Smith, Kevin

Search in DiVA

By author/editor
Konuk, Emir; Raju, Adithya; Welch, Robert; Huix, Joana Palés; Herman, Pawel; Smith, Kevin
By organisation
Computational Science and Technology (CST); Science for Life Laboratory, SciLifeLab; Biomedical Engineering and Health Systems; SeRC - Swedish e-Science Research Centre
Computer graphics and computer vision
