CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0001-6204-0778
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0003-0101-1505
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0001-5211-6388
Karolinska Institutet, Stockholm, Sweden; Karolinska University Hospital, Stockholm, Sweden.
2021 (English). In: Conference on Neural Information Processing Systems (NeurIPS) – Datasets and Benchmarks Proceedings, 2021. Conference paper, Published paper (Refereed)
Abstract [en]

Interval and large invasive breast cancers, which are associated with a worse prognosis than other cancers, are usually detected at a late stage due to false-negative assessments of screening mammograms. The missed screening-time detection is commonly caused by the tumor being obscured by its surrounding breast tissue, a phenomenon called masking. To study and benchmark mammographic masking of cancer, in this work we introduce CSAW-M, the largest public mammographic dataset, collected from over 10,000 individuals and annotated with potential masking. In contrast to previous approaches, which measure breast image density as a proxy, our dataset directly provides annotations of masking potential assessments from five specialists. We also trained deep learning models on CSAW-M to estimate the masking level and showed that the estimated masking is significantly more predictive of screening participants diagnosed with interval and large invasive cancers (without being explicitly trained for these tasks) than its breast density counterparts.

Place, publisher, year, edition, pages
2021.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-340718; OAI: oai:DiVA.org:kth-340718; DiVA, id: diva2:1818664
Conference
35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks, 6-14 Dec 2021, virtual.
Note

QC 20231218

Available from: 2023-12-11 Created: 2023-12-11 Last updated: 2023-12-18. Bibliographically approved
In thesis
1. Breast cancer risk assessment and detection in mammograms with artificial intelligence
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Breast cancer, the most common type of cancer among women worldwide, necessitates reliable early detection methods. Although mammography serves as a cost-effective screening technique, its limitations in sensitivity emphasize the need for more advanced detection approaches. Previous studies have relied on breast density, extracted directly from the mammograms, as a primary metric for cancer risk assessment, given its correlation with increased cancer risk and the masking potential of cancer. However, such a singular metric overlooks image details and spatial relationships critical for cancer diagnosis. To address these limitations, this thesis integrates artificial intelligence (AI) models into mammography, with the goal of enhancing both cancer detection and risk estimation. 

In this thesis, we aim to establish a new benchmark for breast cancer prediction using neural networks. Utilizing the Cohort of Screen-Aged Women (CSAW) dataset, which includes mammography images from 2008 to 2015 in Stockholm, Sweden, we develop three AI models to predict inherent risk, cancer signs, and masking potential of cancer. Combined, these models can effectively identify women in need of supplemental screening, even after a clean exam, paving the way for better early detection of cancer. Individually, important progress has been made on each of these component tasks as well. The risk prediction model, developed and tested on a large population-based cohort, establishes a new state of the art in identifying women at elevated risk of developing breast cancer, outperforming traditional density measures. The risk model is carefully designed to avoid conflating image patterns related to early cancer signs with those related to long-term risk. We also propose a method that allows vision transformers to be trained efficiently on, and make use of, high-resolution images, an essential property for models analyzing mammograms. We also develop an approach to predict the masking potential in a mammogram, that is, the likelihood that a cancer may be obscured by neighboring tissue and consequently misdiagnosed. High masking potential can complicate early detection and delay timely interventions. Along with the model, we curate and release a new public dataset which can help speed up progress on this important task.

Through our research, we demonstrate the transformative potential of AI in mammographic analysis. By capturing subtle image cues, AI models consistently exceed the traditional baselines. These advancements not only highlight the individual and combined advantages of the models, but also signal a transition to an era of AI-enhanced personalized healthcare, promising more efficient resource allocation and better patient outcomes.

Abstract [sv]

Bröstcancer, den vanligaste cancerformen bland kvinnor globalt, kräver tillförlitliga metoder för tidig upptäckt. Även om mammografi fungerar som en kostnadseffektiv screeningteknik, understryker dess begränsningar i känslighet behovet av mer avancerade detektionsmetoder. Tidigare studier har förlitat sig på brösttäthet, utvunnen direkt från mammogram, som en primär indikator för riskbedömning, givet dess samband med ökad cancerrisk och cancermaskeringspotential. Visserligen förbiser en sådan enskild indikator bildinformation och spatiala relationer vilka är kritiska för cancerdiagnos. För att möta dessa begränsningar integrerar denna avhandling artificiell intelligens (AI) modeller i mammografi, med målet att förbättra både cancerdetektion och riskbedömning. 

I denna avhandling syftar vi till att fastställa en ny standard för bröstcancer-prediktion med hjälp av neurala nätverk. Genom att utnyttja datasetet Cohort of Screen-Aged Women (CSAW), som inkluderar mammografier från 2008 till 2015 i Stockholm, Sverige, utvecklar vi tre AI modeller för att förutsäga inneboende risk, tecken på cancer och cancermaskeringspotential. Sammantaget kan dessa modeller effektivt identifiera kvinnor som behöver kompletterande screening, även efter en undersökning där patienten klassificerats som hälsosam, vilket banar väg för tidigare upptäckt av cancer. Individuellt har viktiga framsteg också gjorts i vardera modell. Riskdetektionsmodellen, utvecklad och testad på en stor populationsbaserad kohort, etablerar en ny state-of-the-art vid identifiering av kvinnor med ökad risk att utveckla bröstcancer, och presterar bättre än traditionella täthetsmodeller. Riskmodellen är noggrant utformad för att undvika att sammanblanda bildmönster relaterade till tidiga tecken på cancer med de som relaterar till långsiktig risk. Vi föreslår också en metod som gör det möjligt för vision transformers att effektivt tränas på samt utnyttja högupplösta bilder, en väsentlig egenskap för modeller som berör mammogram. Vi utvecklar också en metod för att förutsäga maskeringspotentialen i mammogram - sannolikheten att en cancer kan döljas av närliggande vävnad och följaktligen misstolkas. Hög maskeringspotential kan komplicera tidig upptäckt och försena ingripanden. Tillsammans med modellen sammanställer och släpper vi ett nytt offentligt dataset som kan hjälpa till att påskynda framsteg inom detta viktiga område.

Genom vår forskning demonstrerar vi den transformativa potentialen med AI i mammografianalys. Genom att fånga subtila bildledtrådar överträffar AI-modeller konsekvent de traditionella baslinjerna. Dessa framsteg belyser inte bara de individuella och kombinerade fördelarna med modellerna, utan signalerar också ett paradigmskifte mot en era av AI-förstärkt personlig hälso- och sjukvård, med ett löfte om mer effektiv resursallokering och förbättrade patientresultat. 

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. p. xi, 61
Series
TRITA-EECS-AVL ; 2024:2
Keywords
Mammography, AI, Breast cancer risk, Breast cancer detection, Mammografi, AI, Bröstcancerrisk, Upptäckt av bröstcancer
National Category
Engineering and Technology Radiology, Nuclear Medicine and Medical Imaging
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-340723 (URN); 978-91-8040-783-0 (ISBN)
Public defence
2024-01-18, Air & Fire, Science for Life Laboratory, Tomtebodavägen 23A, Solna, 14:00 (English)
Opponent
Supervisors
Note

QC 20231212

Available from: 2023-12-12 Created: 2023-12-11 Last updated: 2024-01-19. Bibliographically approved

Open Access in DiVA

csaw-m (8445 kB), 83 downloads
File information
File name: FULLTEXT01.pdf
File size: 8445 kB
Checksum (SHA-512): fbd83b9e429073c1a50d36e831c2cb81ee5f66944ef5a3d53af510c3bffbf9c68fd197c5a6a90c73255cb2d80e20f1ea5eb62838c729e0c61df57a31a16fee07
Type: fulltext
Mimetype: application/pdf
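
The SHA-512 checksum listed in the file information can be used to confirm that a downloaded copy of the full text is intact. The following is a minimal sketch, assuming the file has been saved locally; the helper names `sha512_of_file` and `matches_record` and the local path are illustrative and not part of DiVA.

```python
import hashlib

# SHA-512 checksum as listed in this record for FULLTEXT01.pdf.
EXPECTED_SHA512 = (
    "fbd83b9e429073c1a50d36e831c2cb81ee5f66944ef5a3d53af510c3bffbf9c6"
    "8fd197c5a6a90c73255cb2d80e20f1ea5eb62838c729e0c61df57a31a16fee07"
)


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path):
    """Hypothetical helper: True if the file's digest equals the checksum in the record."""
    return sha512_of_file(path) == EXPECTED_SHA512
```

Calling `matches_record("FULLTEXT01.pdf")` on a local copy (the path is an assumption) should return `True` only if the file is byte-for-byte identical to the one described in the record.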

Authority records

Sorkhei, Moein; Liu, Yue; Azizpour, Hossein; Smith, Kevin
