Breast cancer risk assessment and detection in mammograms with artificial intelligence
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). ORCID iD: 0000-0003-0101-1505
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Breast cancer, the most common type of cancer among women worldwide, necessitates reliable early detection methods. Although mammography serves as a cost-effective screening technique, its limitations in sensitivity emphasize the need for more advanced detection approaches. Previous studies have relied on breast density, extracted directly from the mammograms, as a primary metric for cancer risk assessment, given its correlation with increased cancer risk and the masking potential of cancer. However, such a singular metric overlooks image details and spatial relationships critical for cancer diagnosis. To address these limitations, this thesis integrates artificial intelligence (AI) models into mammography, with the goal of enhancing both cancer detection and risk estimation. 

In this thesis, we aim to establish a new benchmark for breast cancer prediction using neural networks. Utilizing the Cohort of Screen-Aged Women (CSAW) dataset, which includes mammography images from 2008 to 2015 in Stockholm, Sweden, we develop three AI models to predict inherent risk, cancer signs, and masking potential of cancer. Combined, these models can effectively identify women in need of supplemental screening, even after a clean exam, paving the way for better early detection of cancer. Individually, important progress has been made on each of these component tasks as well. The risk prediction model, developed and tested on a large population-based cohort, establishes a new state-of-the-art at identifying women at elevated risk of developing breast cancer, outperforming traditional density measures. The risk model is carefully designed to avoid conflating image patterns related to early cancer signs with those related to long-term risk. We also propose a method that allows vision transformers to be efficiently trained on and make use of high-resolution images, an essential property for models analyzing mammograms. We also develop an approach to predict the masking potential in a mammogram – the likelihood that a cancer may be obscured by neighboring tissue and consequently misdiagnosed. High masking potential can complicate early detection and delay timely interventions. Along with the model, we curate and release a new public dataset which can help speed up progress on this important task.

Through our research, we demonstrate the transformative potential of AI in mammographic analysis. By capturing subtle image cues, AI models consistently exceed the traditional baselines. These advancements not only highlight both the individual and combined advantages of the models, but also signal a transition to an era of AI-enhanced personalized healthcare, promising more efficient resource allocation and better patient outcomes.

Abstract [sv]

Breast cancer, the most common form of cancer among women globally, demands reliable methods for early detection. Although mammography serves as a cost-effective screening technique, its limitations in sensitivity underscore the need for more advanced detection methods. Previous studies have relied on breast density, extracted directly from mammograms, as a primary indicator for risk assessment, given its association with increased cancer risk and its potential to mask cancer. However, such a single indicator overlooks image details and spatial relationships that are critical for cancer diagnosis. To address these limitations, this thesis integrates artificial intelligence (AI) models into mammography, with the goal of improving both cancer detection and risk assessment.

In this thesis, we aim to establish a new benchmark for breast cancer prediction using neural networks. Utilizing the Cohort of Screen-Aged Women (CSAW) dataset, which includes mammograms from 2008 to 2015 in Stockholm, Sweden, we develop three AI models to predict inherent risk, signs of cancer, and the masking potential of cancer. Combined, these models can effectively identify women in need of supplemental screening, even after an examination in which the patient was classified as healthy, paving the way for earlier detection of cancer. Individually, important progress has also been made on each model. The risk prediction model, developed and tested on a large population-based cohort, establishes a new state of the art in identifying women at elevated risk of developing breast cancer, outperforming traditional density models. The risk model is carefully designed to avoid conflating image patterns related to early signs of cancer with those related to long-term risk. We also propose a method that allows vision transformers to be efficiently trained on and make use of high-resolution images, an essential property for models that analyze mammograms. We also develop an approach to predict the masking potential in a mammogram: the likelihood that a cancer may be hidden by neighboring tissue and consequently misinterpreted. High masking potential can complicate early detection and delay interventions. Along with the model, we compile and release a new public dataset that can help accelerate progress in this important area.

Through our research, we demonstrate the transformative potential of AI in mammographic analysis. By capturing subtle image cues, AI models consistently outperform the traditional baselines. These advances not only highlight the individual and combined benefits of the models, but also signal a paradigm shift toward an era of AI-enhanced personalized healthcare, promising more efficient resource allocation and improved patient outcomes.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024, p. xi, 61.
Series
TRITA-EECS-AVL ; 2024:2
Keywords [en]
Mammography, AI, Breast cancer risk, Breast cancer detection
Keywords [sv]
Mammografi, AI, Bröstcancerrisk, Upptäckt av bröstcancer
National Category
Engineering and Technology; Radiology, Nuclear Medicine and Medical Imaging
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-340723
ISBN: 978-91-8040-783-0 (print)
OAI: oai:DiVA.org:kth-340723
DiVA, id: diva2:1818672
Public defence
2024-01-18, Air & Fire, Science for Life Laboratory, Tomtebodavägen 23A, Solna, 14:00 (English)
Opponent
Supervisors
Note

QC 20231212

Available from: 2023-12-12. Created: 2023-12-11. Last updated: 2024-01-19. Bibliographically approved.
List of papers
1. Comparison of a deep learning risk score and standard mammographic density score for breast cancer risk prediction
2020 (English). In: Radiology, ISSN 0033-8419, E-ISSN 1527-1315, Vol. 294, no. 2, p. 265-272. Article in journal (Refereed). Published.
Abstract [en]

Background: Most risk prediction models for breast cancer are based on questionnaires and mammographic density assessments. By training a deep neural network, further information in the mammographic images can be considered.

Purpose: To develop a risk score that is associated with future breast cancer and compare it with density-based models.

Materials and Methods: In this retrospective study, all women aged 40-74 years within the Karolinska University Hospital uptake area in whom breast cancer was diagnosed in 2013-2014 were included along with healthy control subjects. Network development was based on cases diagnosed from 2008 to 2012. The deep learning (DL) risk score, dense area, and percentage density were calculated for the earliest available digital mammographic examination for each woman. Logistic regression models were fitted to determine the association with subsequent breast cancer. False-negative rates were obtained for the DL risk score, age-adjusted dense area, and age-adjusted percentage density.

Results: A total of 2283 women, 278 of whom were later diagnosed with breast cancer, were evaluated. The age at mammography (mean, 55.7 years vs 54.6 years; P < .001), the dense area (mean, 38.2 cm² vs 34.2 cm²; P < .001), and the percentage density (mean, 25.6% vs 24.0%; P < .001) were higher among women diagnosed with breast cancer than in those without a breast cancer diagnosis. The odds ratios and areas under the receiver operating characteristic curve (AUCs) were higher for the age-adjusted DL risk score than for dense area and percentage density: 1.56 (95% confidence interval [CI]: 1.48, 1.64; AUC, 0.65), 1.31 (95% CI: 1.24, 1.38; AUC, 0.60), and 1.18 (95% CI: 1.11, 1.25; AUC, 0.57), respectively (P < .001 for AUC). The false-negative rate was lower: 31% (95% CI: 29%, 34%), 36% (95% CI: 33%, 39%; P = .006), and 39% (95% CI: 37%, 42%; P < .001); this difference was most pronounced for more aggressive cancers.
Conclusion: Compared with density-based models, a deep neural network can more accurately predict which women are at risk for future breast cancer, with a lower false-negative rate for more aggressive cancers.
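The evaluation described above, fitting a logistic regression per image-derived score and comparing AUCs, can be sketched as follows. This is a hypothetical illustration on synthetic data: the variable names, effect sizes, and helper function are invented for the sketch and are not from the study's code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 2283                       # cohort size reported in the abstract
y = rng.random(n) < 278 / n    # ~278 women later diagnosed with cancer

# Synthetic stand-ins for the per-woman predictors; the DL score is
# given a stronger (made-up) association with the outcome.
dl_risk_score = rng.normal(0, 1, n) + 0.5 * y
dense_area = rng.normal(36, 10, n) + 2.0 * y   # cm², loosely matching the abstract
age = rng.normal(55, 8, n)

def age_adjusted_auc(score, age, y):
    """Fit a logistic regression with age as a covariate, return the AUC."""
    X = np.column_stack([score, age])
    model = LogisticRegression(max_iter=1000).fit(X, y)
    return roc_auc_score(y, model.predict_proba(X)[:, 1])

print("age-adjusted DL risk score AUC:", age_adjusted_auc(dl_risk_score, age, y))
print("age-adjusted dense area AUC:   ", age_adjusted_auc(dense_area, age, y))
```

With the synthetic effect sizes above, the DL stand-in yields a visibly higher AUC than the density stand-in, mirroring the direction (not the exact values) of the reported comparison.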

Place, publisher, year, edition, pages
Radiological Society of North America Inc., 2020
National Category
Radiology, Nuclear Medicine and Medical Imaging
Identifiers
urn:nbn:se:kth:diva-267834 (URN). 10.1148/radiol.2019190872 (DOI). 000508455500006 (). 31845842 (PubMedID). 2-s2.0-85078538925 (Scopus ID)
Note

QC 20200227

Available from: 2020-02-27. Created: 2020-02-27. Last updated: 2024-03-15. Bibliographically approved.
2. Decoupling Inherent Risk and Early Cancer Signs in Image-Based Breast Cancer Risk Models
2020 (English). In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2020: 23rd International Conference, Lima, Peru, October 4–8, 2020, Proceedings, Part VI (Lecture Notes in Computer Science), Springer Nature, 2020, Vol. 12266, p. 230-240. Conference paper, Published paper (Refereed).
Abstract [en]

The ability to accurately estimate risk of developing breast cancer would be invaluable for clinical decision-making. One promising new approach is to integrate image-based risk models based on deep neural networks. However, one must take care when using such models, as selection of training data influences the patterns the network will learn to identify. With this in mind, we trained networks using three different criteria to select the positive training data (i.e. images from patients that will develop cancer): an inherent risk model trained on images with no visible signs of cancer, a cancer signs model trained on images containing cancer or early signs of cancer, and a conflated model trained on all images from patients with a cancer diagnosis. We find that these three models learn distinctive features that focus on different patterns, which translates to contrasts in performance. Short-term risk is best estimated by the cancer signs model, whilst long-term risk is best estimated by the inherent risk model. Carelessly training with all images conflates inherent risk with early cancer signs, and yields sub-optimal estimates in both regimes. As a consequence, conflated models may lead physicians to recommend preventative action when early cancer signs are already visible.
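The three positive-training-set criteria contrasted above can be illustrated with a toy filter. The record structure and field names here are invented for illustration only; the actual selection operated on mammography exams in the CSAW cohort.

```python
# Hypothetical exam records: did the patient eventually develop cancer,
# and were signs of cancer visible in this particular image?
exams = [
    {"id": 1, "future_cancer": True,  "visible_signs": False},  # risk only
    {"id": 2, "future_cancer": True,  "visible_signs": True},   # early signs
    {"id": 3, "future_cancer": False, "visible_signs": False},  # control
]

# Inherent-risk model: positives are cancer patients' images WITHOUT visible signs.
inherent_pos = [e for e in exams if e["future_cancer"] and not e["visible_signs"]]
# Cancer-signs model: positives are images WITH cancer or early visible signs.
signs_pos = [e for e in exams if e["future_cancer"] and e["visible_signs"]]
# Conflated model: all images from patients with a cancer diagnosis.
conflated_pos = [e for e in exams if e["future_cancer"]]

# The conflated positive set mixes the two distinct signals.
assert len(conflated_pos) == len(inherent_pos) + len(signs_pos)
```

The point of the paper is precisely that the third, "carelessly" pooled set mixes two different image patterns, so a network trained on it estimates neither short-term nor long-term risk optimally.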

Place, publisher, year, edition, pages
Springer Nature, 2020
Series
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN 0302-9743 ; 12266
Keywords
Deep learning, Mammography, Risk prediction
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-291719 (URN). 10.1007/978-3-030-59725-2_23 (DOI). 2-s2.0-85092769948 (Scopus ID)
Conference
23rd International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2020; Lima; Peru; 4 October 2020 through 8 October 2020
Note

QC 20210323

Available from: 2021-03-23. Created: 2021-03-23. Last updated: 2023-12-11. Bibliographically approved.
3. CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer
2021 (English). In: Conference on Neural Information Processing Systems (NeurIPS) – Datasets and Benchmarks Proceedings, 2021. Conference paper, Published paper (Refereed).
Abstract [en]

Interval and large invasive breast cancers, which are associated with worse prognosis than other cancers, are usually detected at a late stage due to false negative assessments of screening mammograms. The missed screening-time detection is commonly caused by the tumor being obscured by its surrounding breast tissues, a phenomenon called masking. To study and benchmark mammographic masking of cancer, in this work we introduce CSAW-M, the largest public mammographic dataset, collected from over 10,000 individuals and annotated with potential masking. In contrast to the previous approaches which measure breast image density as a proxy, our dataset directly provides annotations of masking potential assessments from five specialists. We also trained deep learning models on CSAW-M to estimate the masking level and showed that the estimated masking is significantly more predictive of screening participants diagnosed with interval and large invasive cancers – without being explicitly trained for these tasks – than its breast density counterparts.

National Category
Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-340718 (URN)
Conference
35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks, 6-14 Dec 2021, virtual.
Note

QC 20231218

Available from: 2023-12-11. Created: 2023-12-11. Last updated: 2023-12-18. Bibliographically approved.
4. PatchDropout: Economizing Vision Transformers Using Patch Dropout
2023 (English). In: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 3942-3951. Conference paper, Published paper (Refereed).
Abstract [en]

Vision transformers have demonstrated the potential to outperform CNNs in a variety of vision tasks. But the computational and memory requirements of these models prohibit their use in many applications, especially those that depend on high-resolution images, such as medical image classification. Efforts to train ViTs more efficiently are overly complicated, necessitating architectural changes or intricate training schemes. In this work, we show that standard ViT models can be efficiently trained at high resolution by randomly dropping input image patches. This simple approach, PatchDropout, reduces FLOPs and memory by at least 50% in standard natural image datasets such as IMAGENET, and those savings only increase with image size. On CSAW, a high-resolution medical dataset, we observe a 5× savings in computation and memory using PatchDropout, along with a boost in performance. For practitioners with a fixed computational or memory budget, PatchDropout makes it possible to choose image resolution, hyperparameters, or model size to get the most performance out of their model.
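The core idea described above, randomly dropping patch tokens before the transformer encoder, can be sketched in a few lines. This is a simplified NumPy illustration under stated assumptions, not the paper's implementation, which for instance operates on batched token tensors and retains special tokens such as CLS.

```python
import numpy as np

def patch_dropout(tokens, keep_ratio, rng):
    """Keep a random subset of patch tokens.

    tokens: array of shape (num_patches, dim), one embedding per image patch.
    Returns the kept rows in their original order, so positional
    information (added before this step) is preserved.
    """
    n = tokens.shape[0]
    n_keep = max(1, int(n * keep_ratio))
    idx = rng.choice(n, size=n_keep, replace=False)
    return tokens[np.sort(idx)]

rng = np.random.default_rng(0)
patches = rng.normal(size=(196, 768))   # e.g. a 14x14 grid of patch embeddings
kept = patch_dropout(patches, keep_ratio=0.5, rng=rng)
print(kept.shape)                        # (98, 768)
```

Because self-attention cost grows quadratically with sequence length, halving the token count cuts attention FLOPs by roughly 4×, which is why the savings grow with image resolution.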

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023
Series
IEEE Winter Conference on Applications of Computer Vision, ISSN 2472-6737
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
urn:nbn:se:kth:diva-333235 (URN). 10.1109/WACV56688.2023.00394 (DOI). 000971500204006 (). 2-s2.0-85149011721 (Scopus ID)
Conference
23rd IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), JAN 03-07, 2023, Waikoloa, HI
Note

QC 20230731

Available from: 2023-07-31. Created: 2023-07-31. Last updated: 2023-12-11. Bibliographically approved.
5. Selecting Women for Supplemental Breast Imaging using AI Biomarkers of Cancer Signs, Masking, and Risk
2023 (English). Manuscript (preprint) (Other academic)
Abstract [en]

Background: Traditional mammographic density aids in determining the need for supplemental imaging by MRI or ultrasound. However, AI image analysis, considering more subtle and complex image features, may enable a more effective identification of women requiring supplemental imaging.

Purpose: To assess if AISmartDensity, an AI-based score considering cancer signs, masking, and risk, surpasses traditional mammographic density in identifying women for supplemental imaging after negative screening mammography.

Methods: This retrospective study included randomly selected breast cancer patients and healthy controls at Karolinska University Hospital between 2008 and 2015. Bootstrapping simulated a 0.2% interval cancer rate. We included previous exams for diagnosed women and all exams for controls. AISmartDensity had been developed using random mammograms from a population non-overlapping with the current study population. We evaluated the ability of AISmartDensity to identify, based on negative screening mammograms, women with interval cancer and next-round screen-detected cancer. It was compared to age and density models, with sensitivity and PPV calculated for women with the top 8% of scores, mimicking the proportion of the BI-RADS "extremely dense" category. Statistical significance was determined using Student's t-test.

Results: The study involved 2043 women, 258 with breast cancer diagnosed within 3 years of a negative mammogram, and 1785 healthy controls. Diagnosed women had a median age of 57 years (IQR 16) versus 53 years (IQR 15) for controls (p < .001). At the 92nd percentile, AISmartDensity identified 87 (33.67%) future cancers with a PPV of 1.68%, whereas mammographic density identified 34 (13.18%) with a PPV of 0.66% (p < .001). AISmartDensity identified 32% of interval and 36% of next-round cancers, versus mammographic density's 16% and 10%. The combined mammographic density and age model yielded an AUC of 0.60, significantly lower than AISmartDensity's 0.73 (p < .001).

Conclusions: AISmartDensity, integrating cancer signs, masking, and risk, more effectively identified women for additional breast imaging than traditional age and density models.
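The top-8% selection rule used in the evaluation above, flagging women whose score exceeds the 92nd percentile and computing sensitivity and PPV among them, can be sketched as follows. The scores, labels, and effect size are synthetic stand-ins, not study data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2043
y = np.zeros(n, dtype=bool)
y[:258] = True                           # 258 future cancers, as in the abstract
scores = rng.normal(0, 1, n) + 0.8 * y   # made-up score, higher for future cancers

# Flag the top 8% of scores, mimicking the "extremely dense" proportion.
threshold = np.percentile(scores, 92)
flagged = scores > threshold

sensitivity = (flagged & y).sum() / y.sum()    # fraction of cancers caught
ppv = (flagged & y).sum() / flagged.sum()      # fraction of flags that are cancers
print(f"sensitivity={sensitivity:.2f}, PPV={ppv:.2f}")
```

The design choice matters clinically: fixing the flagged fraction at 8% holds the supplemental-imaging workload constant, so any gain in sensitivity or PPV reflects better ranking by the score rather than a larger recall pool.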

National Category
Medical and Health Sciences; Engineering and Technology
Identifiers
urn:nbn:se:kth:diva-340721 (URN)
Note

QC 20231218

Available from: 2023-12-11. Created: 2023-12-11. Last updated: 2023-12-18. Bibliographically approved.

Open Access in DiVA

kappa (10216 kB), 600 downloads
File information
File name: FULLTEXT01.pdf. File size: 10216 kB. Checksum: SHA-512
7daf15bb7494e9f1547153f64bab36d560a760646ad42b1a2f1910c0f6250dc0b475f7f552d062a8bb6baca00a385f80af8617de75d77fd40afd37539fb84884
Type: fulltext. Mimetype: application/pdf

Authority records

Liu, Yue

