KTH Publications (kth.se)
Publications (10 of 13)
Hein, D., Chen, Z., Ostmeier, S., Xu, J., Varma, M., Reis, E. P., . . . Chaudhari, A. S. (2025). CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback.
CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback
2025 (English) Manuscript (preprint) (Other academic)
Abstract [en]

Radiologists play a crucial role in translating medical images into actionable reports. However, the field faces staffing shortages and increasing workloads. While automated approaches using vision-language models (VLMs) show promise as assistants, they require exceptionally high accuracy. Most current VLMs in radiology rely solely on supervised fine-tuning, whereas in the general domain additional preference fine-tuning has become standard practice in the post-training pipeline. The challenge in radiology lies in the prohibitive cost of obtaining radiologist feedback at scale. To address this challenge, we propose an automated pipeline for preference feedback, focusing on chest X-ray radiology report generation (RRG). Specifically, our method leverages publicly available datasets of images paired with radiologist-written reference reports, together with reference-based metrics, or Judges, eliminating the need for additional radiologist feedback. We investigate reward overoptimization via length exploitation in this setting and introduce a length-controlled version of the GREEN score. Our best-performing setup achieves state-of-the-art CheXbert scores on the MIMIC-CXR dataset for the RRG task while, on average, maintaining robust performance across six additional image perception and reasoning tasks.
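As a hedged illustration of the automated pipeline described in the abstract (not the authors' code; the toy Judge, the scoring rule, and the penalty weight below are hypothetical stand-ins for the length-controlled GREEN score), candidate reports can be ranked by a reference-based metric with a length penalty to curb length exploitation, and the best and worst candidates form a preference pair:

```python
# Hypothetical sketch: candidate reports are scored against the radiologist-
# written reference with a reference-based metric ("Judge"); a length penalty
# discourages reward exploitation by padding the report.

def length_controlled_score(raw_score, cand_len, ref_len, penalty=0.001):
    """Penalize candidates that grow longer than the reference report."""
    return raw_score - penalty * max(0, cand_len - ref_len)

def build_preference_pair(candidates, reference, judge):
    """Return (chosen, rejected): highest- and lowest-scoring candidates."""
    ranked = sorted(
        candidates,
        key=lambda c: length_controlled_score(
            judge(c, reference), len(c.split()), len(reference.split())
        ),
    )
    return ranked[-1], ranked[0]

def toy_judge(candidate, reference):
    """Toy reference-based metric: fraction of reference words recovered."""
    cand_words = set(candidate.lower().split())
    ref_words = reference.lower().split()
    return sum(w in cand_words for w in ref_words) / len(ref_words)

reference = "no acute cardiopulmonary abnormality"
candidates = [
    "no acute cardiopulmonary abnormality",
    "lungs are clear",
    "no acute cardiopulmonary abnormality " + "stable " * 50,  # length exploit
]
chosen, rejected = build_preference_pair(candidates, reference, toy_judge)
# The concise, accurate report is chosen; the padded one loses to the penalty.
```

The resulting (chosen, rejected) pairs could then feed a standard preference-optimization objective such as DPO; the specific scoring above is illustrative only.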

National Category
Other Physics Topics
Identifiers
urn:nbn:se:kth:diva-363190 (URN) 10.48550/arXiv.2410.07025 (DOI)
Note

QC 20250507

Available from: 2025-05-07 Created: 2025-05-07 Last updated: 2025-05-09. Bibliographically approved
Hein, D., Stevens, G., Wang, A. & Wang, G. (2025). PFCM: Poisson flow consistency models for low-dose CT image denoising. IEEE Transactions on Medical Imaging, 44(7), 2989-3001
PFCM: Poisson flow consistency models for low-dose CT image denoising
2025 (English) In: IEEE Transactions on Medical Imaging, ISSN 0278-0062, E-ISSN 1558-254X, Vol. 44, no 7, p. 2989-3001. Article in journal (Refereed) Published
Abstract [en]

X-ray computed tomography (CT) is widely used for medical diagnosis and treatment planning; however, concerns about ionizing radiation exposure drive efforts to optimize image quality at lower doses. This study introduces Poisson Flow Consistency Models (PFCM), a novel family of deep generative models that combines the robustness of PFGM++ with the efficient single-step sampling of consistency models. PFCM are derived by generalizing consistency distillation to PFGM++ through a change of variables and an updated noise distribution. As a distilled version of PFGM++, PFCM inherit the ability to trade off robustness for rigidity via the hyperparameter D ∈ (0, ∞), a fact that we exploit to adapt this novel generative model to the task of low-dose CT image denoising via a “task-specific” sampler that “hijacks” the generative process by replacing an intermediate state with the low-dose CT image. While this “hijacking” introduces a severe mismatch (the noise characteristics of low-dose CT images differ from those of intermediate states in the Poisson flow process), we show that the inherent robustness of PFCM at small D effectively mitigates this issue. The resulting sampler achieves excellent performance in terms of LPIPS, SSIM, and PSNR on the Mayo low-dose CT dataset. By contrast, an analogous sampler based on standard consistency models is found to be significantly less robust under the same conditions, highlighting the importance of the tunable D afforded by our novel framework. To highlight generalizability, we show effective denoising of clinical images from a prototype photon-counting system reconstructed using a sharper kernel and at a range of energy levels.
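A minimal numpy sketch of the “hijacking” idea from the abstract (assumptions: `f_toy` is an illustrative stand-in for a trained PFCM, and the shrinkage rule and `sigma_hijack` value are not the paper's model):

```python
import numpy as np

def f_toy(x, sigma):
    """Toy single-step consistency function: a trained PFCM would map a
    noisy state at noise level sigma directly to a clean image; here we
    simply shrink toward zero in proportion to the assumed noise level."""
    return x / (1.0 + sigma**2)

def hijacked_single_step(low_dose_img, sigma_hijack, model=f_toy):
    """Enter the generative trajectory at an intermediate noise level and
    replace the state there with the low-dose CT image, so a single
    network evaluation (NFE = 1) yields the denoised image."""
    return model(low_dose_img, sigma_hijack)

rng = np.random.default_rng(0)
clean = np.zeros((8, 8))                              # toy ground truth
low_dose = clean + 0.5 * rng.standard_normal((8, 8))  # simulated LDCT noise
denoised = hijacked_single_step(low_dose, sigma_hijack=0.5)
```

The mismatch the abstract mentions is visible here: the low-dose image is not an actual intermediate state of the flow, which is why the tunable robustness (small D) matters in the real model.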

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025
National Category
Other Physics Topics
Identifiers
urn:nbn:se:kth:diva-363187 (URN) 10.1109/tmi.2025.3558019 (DOI) 001523480800003 () 40215159 (PubMedID) 2-s2.0-105002738267 (Scopus ID)
Note

QC 20250507

Available from: 2025-05-07 Created: 2025-05-07 Last updated: 2025-10-06. Bibliographically approved
Hein, D., Bozorgpour, A., Merhof, D. & Wang, G. (2025). Physics-Inspired Generative Models in Medical Imaging. Annual Review of Biomedical Engineering, 27(1), 499-525
Physics-Inspired Generative Models in Medical Imaging
2025 (English) In: Annual Review of Biomedical Engineering, ISSN 1523-9829, E-ISSN 1545-4274, Vol. 27, no 1, p. 499-525. Article, review/survey (Refereed) Published
Abstract [en]

Physics-inspired generative models (GMs), in particular diffusion models and Poisson flow models, enhance Bayesian methods and promise great utility in medical imaging. This review examines the transformative role of such generative methods. First, a variety of physics-inspired GMs, including denoising diffusion probabilistic models, score-based diffusion models, and Poisson flow generative models (including PFGM++), are revisited, with an emphasis on their accuracy, robustness, and acceleration. Then, major applications of physics-inspired GMs in medical imaging are presented, comprising image reconstruction, image generation, and image analysis. Finally, future research directions are brainstormed, including unification of physics-inspired GMs, integration with vision-language models, and potential novel applications of GMs. Since the development of generative methods has been rapid, it is hoped that this review will give peers and learners a timely snapshot of this new family of physics-driven GMs and help capitalize on their enormous potential for medical imaging.

Place, publisher, year, edition, pages
Annual Reviews, 2025
Keywords
Bayesian theorem, consistency model, diffusion model, image analysis, image reconstruction, image/data synthesis, medical imaging, PFGM++, physics-inspired generative models, Poisson flow generative model
National Category
Medical Imaging
Identifiers
urn:nbn:se:kth:diva-363748 (URN) 10.1146/annurev-bioeng-102723-013922 (DOI) 001491920300020 () 40310888 (PubMedID) 2-s2.0-105004481565 (Scopus ID)
Note

QC 20250522

Available from: 2025-05-21 Created: 2025-05-21 Last updated: 2025-07-03. Bibliographically approved
Hein, D., Holmin, S., Prochazka, V., Yin, Z., Danielsson, M., Persson, M. & Wang, G. (2025). Syn2Real: synthesis of CT image ring artifacts for deep learning-based correction. Physics in Medicine and Biology, 70(4), Article ID 04NT01.
Syn2Real: synthesis of CT image ring artifacts for deep learning-based correction
2025 (English) In: Physics in Medicine and Biology, ISSN 0031-9155, E-ISSN 1361-6560, Vol. 70, no 4, article id 04NT01. Article in journal (Refereed) Published
Abstract [en]

Objective. We strive to overcome the challenges posed by ring artifacts in x-ray computed tomography (CT) by developing a novel approach for generating training data for deep learning-based methods. Training such networks requires large, high-quality datasets that are often generated in the data domain, which is time-consuming and expensive. Our objective is to develop a technique for synthesizing realistic ring artifacts directly in the image domain, enabling scalable production of training data without relying on specific imaging system physics. Approach. We develop 'Syn2Real,' a computationally efficient pipeline that generates realistic ring artifacts directly in the image domain. To demonstrate the effectiveness of our approach, we train two versions of UNet, vanilla and a high-capacity version with self-attention layers that we call UNetpp, with ℓ2 and perceptual losses, as well as a diffusion model, on energy-integrating CT images with and without these synthetic ring artifacts. Main Results. Despite being trained on conventional single-energy CT images, our models effectively correct ring artifacts across various monoenergetic images, at different energy levels and slice thicknesses, from a prototype photon-counting CT system. This generalizability validates the realism and versatility of our ring artifact generation process. Significance. Ring artifacts in x-ray CT pose a unique challenge to image quality and clinical utility. By focusing on data generation, our work provides a foundation for developing more robust and adaptable ring artifact correction methods for pre-clinical, clinical, and other CT applications.
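A hedged image-domain sketch of the kind of ring synthesis the abstract describes (not the authors' pipeline; the per-radial-bin multiplicative gain model and its strength are assumptions): ring artifacts stem from per-detector-channel gain errors, which appear in reconstructed images as concentric rings around the isocenter, so one can draw a random gain per radial bin and apply it along circles of constant radius:

```python
import numpy as np

def add_synthetic_rings(img, strength=0.02, seed=0):
    """Apply a random multiplicative gain to each radial bin around the
    image center, mimicking detector-gain-induced ring artifacts."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.sqrt((yy - cy) ** 2 + (xx - cx) ** 2).astype(int)  # radial bin
    gains = 1.0 + strength * rng.standard_normal(radius.max() + 1)
    return img * gains[radius]

clean = np.full((64, 64), 100.0)      # flat toy phantom (label)
ringed = add_synthetic_rings(clean)   # corrupted training input
```

Pairs of (ringed, clean) images would then serve as network input and label; pixels at the same radius receive the same gain, so the corruption is constant along each ring.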

Place, publisher, year, edition, pages
IOP Publishing, 2025
Keywords
deep learning, CT, photon-counting CT, ring artifacts, data synthesis, UNet
National Category
Radiology and Medical Imaging Medical Imaging Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-360399 (URN) 10.1088/1361-6560/adad2c (DOI) 001415391700001 () 39842097 (PubMedID) 2-s2.0-85218222563 (Scopus ID)
Note

QC 20250226

Available from: 2025-02-26 Created: 2025-02-26 Last updated: 2025-05-08. Bibliographically approved
Larsson, K., Hein, D., Huang, R., Collin, D., Scotti, A., Fredenberg, E., . . . Persson, M. (2024). Deep learning estimation of proton stopping power with photon-counting computed tomography: a virtual study. Journal of Medical Imaging, 11, Article ID S12809.
Deep learning estimation of proton stopping power with photon-counting computed tomography: a virtual study
2024 (English) In: Journal of Medical Imaging, ISSN 2329-4302, E-ISSN 2329-4310, Vol. 11, article id S12809. Article in journal (Refereed) Published
Abstract [en]

Purpose: Proton radiation therapy may achieve precise dose delivery to the tumor while sparing non-cancerous surrounding tissue, owing to the distinct Bragg peaks of protons. Aligning the high-dose region with the tumor requires accurate estimates of the proton stopping power ratio (SPR) of patient tissues, commonly derived from computed tomography (CT) image data. Photon-counting detectors for CT have demonstrated advantages over their energy-integrating counterparts, such as improved quantitative imaging, higher spatial resolution, and filtering of electronic noise. We assessed the potential of photon-counting computed tomography (PCCT) for improving SPR estimation by training a deep neural network on a domain transform from PCCT images to SPR maps. Approach: The XCAT phantom was used to simulate PCCT images of the head with CatSim, as well as to compute corresponding ground truth SPR maps. The tube current was set to 260 mA, the tube voltage to 120 kV, and the number of view angles to 4000. The CT images and SPR maps were used as input and labels for training a U-Net. Results: Prediction of SPR with the network yielded average root mean square errors (RMSE) of 0.26% to 0.41%, an improvement on the RMSE of methods based on physical modeling developed for single-energy CT (0.40% to 1.30%) and dual-energy CT (0.41% to 3.00%), as evaluated on the simulated PCCT data. Conclusions: These early results show promise for using a combination of PCCT and deep learning for estimating SPR, which in turn demonstrates potential for reducing the beam range uncertainty in proton therapy.
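For concreteness, a small sketch of a relative RMSE figure of merit like the one quoted above (the paper's exact definition may differ; the toy arrays are illustrative stand-ins for the U-Net prediction and the XCAT ground truth):

```python
import numpy as np

def relative_rmse_percent(pred, truth):
    """Root-mean-square error of the predicted SPR map, expressed as a
    percentage of the mean ground-truth SPR."""
    rmse = np.sqrt(np.mean((pred - truth) ** 2))
    return 100.0 * rmse / np.mean(truth)

truth = np.full((32, 32), 1.04)   # soft-tissue-like SPR map (toy)
pred = truth + 0.003              # small systematic prediction bias (toy)
err = relative_rmse_percent(pred, truth)
```

With these toy numbers the error lands inside the 0.26%–0.41% range reported for the network, purely by construction.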

Keywords
deep learning, photon-counting computed tomography, proton stopping power, proton therapy
National Category
Radiology, Nuclear Medicine and Medical Imaging Medical Imaging
Identifiers
urn:nbn:se:kth:diva-358410 (URN) 10.1117/1.JMI.11.S1.S12809 (DOI) 001386330400005 () 2-s2.0-85214080434 (Scopus ID)
Note

QC 20250122

Available from: 2025-01-15 Created: 2025-01-15 Last updated: 2025-01-22. Bibliographically approved
Pandurevic, P., Back, A., Hein, D. & Persson, M. (2024). Impact of deep-learning CT image denoising on the accuracy of radiomics parameter estimation. In: Medical Imaging 2024: Physics of Medical Imaging: . Paper presented at Medical Imaging 2024: Physics of Medical Imaging, San Diego, United States of America, Feb 19 2024 - Feb 22 2024. SPIE-Intl Soc Optical Eng, Article ID 129252C.
Impact of deep-learning CT image denoising on the accuracy of radiomics parameter estimation
2024 (English) In: Medical Imaging 2024: Physics of Medical Imaging, SPIE-Intl Soc Optical Eng, 2024, article id 129252C. Conference paper, Published paper (Refereed)
Abstract [en]

In CT radiomics, numerical parameters extracted from CT images are analyzed to find biomarkers. Since these numerical parameters can vary with imaging parameters, there is a need to optimize acquisition protocols for radiomics. In this work, we investigate the effect of deep-learning-based image reconstruction on the accuracy of radiomic parameters of tumors. We imaged a 3D-printed lung phantom containing four tumors (ellipsoidal, lobulated, spherical, and spiculated), using the CAD model as ground truth. The phantom was 3D printed using fused deposition modeling with a PLA filament and an 80% fill rate with a gyroidal pattern to mimic soft tissue. CT images of the 3D-printed phantom and tumors were acquired with a GE Revolution scanner at 120 kVp and 213 mAs. We reconstructed images using FBP and a vendor-supplied deep learning image reconstruction (DLIR) method (TrueFidelity, GE HealthCare). We also applied 24 custom convolutional neural network denoisers with a U-Net architecture, trained on the AAPM-Mayo Clinic Low Dose CT dataset. After segmentation, 14 radiomic features were extracted using SlicerRadiomics. Results show that the vendor-supplied DLIR gave a smaller relative error than FBP for 87% of radiomic features. Eight of the 24 custom denoisers yielded a smaller error than FBP in 50% or more of the radiomic measurements. One denoiser (VGG16 + L1 loss, 32 features, batch size 16) outperformed FBP in 84% of measurements and outperformed the vendor-supplied DLIR in 63% of measurements. In conclusion, our results demonstrate that deep-learning-based denoising has the potential to improve the accuracy of CT radiomics.
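The "outperformed FBP in X% of measurements" comparison in the abstract can be sketched as follows (hypothetical feature values, not the study's data):

```python
import numpy as np

def relative_errors(estimates, truth):
    """Per-feature relative error |estimate - truth| / |truth|."""
    return np.abs(estimates - truth) / np.abs(truth)

def win_fraction(err_a, err_b):
    """Fraction of features where method A has smaller error than B."""
    return np.mean(err_a < err_b)

# Toy radiomic feature values (ground truth from the CAD model, plus two
# reconstructions); the numbers are made up for illustration.
truth = np.array([10.0, 2.5, 0.8, 40.0])
fbp = np.array([11.0, 2.9, 0.9, 43.0])
dlir = np.array([10.2, 2.6, 1.0, 41.0])
frac = win_fraction(relative_errors(dlir, truth), relative_errors(fbp, truth))
```

Repeating this over all 14 features and all denoisers yields the percentages the abstract reports.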

Place, publisher, year, edition, pages
SPIE-Intl Soc Optical Eng, 2024
Keywords
Computer Tomography, Deep Learning, Machine Learning, Perceptual loss, Radiomics
National Category
Radiology, Nuclear Medicine and Medical Imaging
Identifiers
urn:nbn:se:kth:diva-347136 (URN) 10.1117/12.3006732 (DOI) 001223517100071 () 2-s2.0-85193518465 (Scopus ID)
Conference
Medical Imaging 2024: Physics of Medical Imaging, San Diego, United States of America, Feb 19 2024 - Feb 22 2024
Note

Part of proceedings ISBN: 978-151067154-6

QC 20240610

Available from: 2024-06-03 Created: 2024-06-03 Last updated: 2024-06-14. Bibliographically approved
Hein, D., Holmin, S., Szczykutowicz, T., Maltz, J. S., Danielsson, M., Wang, G. & Persson, M. (2024). Noise suppression in photon-counting computed tomography using unsupervised Poisson flow generative models. Visual Computing for Industry, Biomedicine, and Art, 7(1), Article ID 24.
Noise suppression in photon-counting computed tomography using unsupervised Poisson flow generative models
2024 (English) In: Visual Computing for Industry, Biomedicine, and Art, ISSN 2096-496X, Vol. 7, no 1, article id 24. Article in journal (Refereed) Published
Abstract [en]

Deep learning (DL) has proven to be important for computed tomography (CT) image denoising. However, such models are usually trained under supervision, requiring paired data that may be difficult to obtain in practice. Diffusion models offer unsupervised means of solving a wide range of inverse problems via posterior sampling. In particular, using the estimated unconditional score function of the prior distribution, obtained via unsupervised learning, one can sample from the desired posterior via hijacking and regularization. However, due to the iterative solvers used, the number of function evaluations (NFE) required may be orders of magnitude larger than for single-step samplers. In this paper, we present a novel image denoising technique for photon-counting CT by extending the unsupervised approach to inverse problem solving to the case of Poisson flow generative models (PFGM++). By hijacking and regularizing the sampling process, we obtain a single-step sampler, that is, NFE = 1. Our proposed method incorporates posterior sampling using diffusion models as a special case. We demonstrate that the added robustness afforded by the PFGM++ framework yields significant performance gains. Our results indicate competitive performance compared to popular supervised methods, including state-of-the-art diffusion-style models with NFE = 1 (consistency models), as well as unsupervised and non-DL-based image denoising techniques, on clinical low-dose CT data and clinical images from a prototype photon-counting CT system developed by GE HealthCare.

Place, publisher, year, edition, pages
Springer Nature, 2024
Keywords
Deep learning, Denoising, Diffusion models, Photon-counting CT, Poisson flow generative models
National Category
Computer graphics and computer vision Other Physics Topics
Identifiers
urn:nbn:se:kth:diva-354278 (URN) 10.1186/s42492-024-00175-6 (DOI) 001319529000001 () 2-s2.0-85204916575 (Scopus ID)
Note

QC 20241008

Available from: 2024-10-02 Created: 2024-10-02 Last updated: 2025-05-08. Bibliographically approved
Hein, D., Holmin, S., Szczykutowicz, T., Maltz, J. S., Danielsson, M., Wang, G. & Persson, M. (2024). PPFM: Image Denoising in Photon-Counting CT Using Single-Step Posterior Sampling Poisson Flow Generative Models. IEEE Transactions on Radiation and Plasma Medical Sciences, 8(7), 788-799
PPFM: Image Denoising in Photon-Counting CT Using Single-Step Posterior Sampling Poisson Flow Generative Models
2024 (English) In: IEEE Transactions on Radiation and Plasma Medical Sciences, ISSN 2469-7311, Vol. 8, no 7, p. 788-799. Article in journal (Refereed) Published
Abstract [en]

Diffusion and Poisson flow models have shown impressive performance in a wide range of generative tasks, including low-dose CT (LDCT) image denoising. However, one limitation in general, and for clinical applications in particular, is slow sampling. Due to their iterative nature, the number of function evaluations (NFEs) required is usually on the order of 10 to 10^3, both for conditional and unconditional generation. In this article, we present posterior sampling Poisson flow generative models (PPFMs), a novel image denoising technique for low-dose and photon-counting CT that produces excellent image quality whilst keeping NFE = 1. Updating the training and sampling processes of PFGM++, we learn a conditional generator that defines a trajectory between the prior noise distribution and the posterior distribution of interest. We additionally hijack and regularize the sampling process to achieve NFE = 1. Our results shed light on the benefits of the PFGM++ framework compared to diffusion models. In addition, PPFM is shown to perform favorably compared to current state-of-the-art diffusion-style models with NFE = 1 (consistency models), as well as popular deep learning and non-deep-learning-based image denoising techniques, on clinical LDCT images and clinical images from a prototype photon-counting CT system.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Computed tomography, Noise, Photonics, Image denoising, Training, Image reconstruction, Computational modeling, Deep learning, denoising, diffusion models, photon-counting computed tomography (PCCT), Poisson flow generative models (PFGMs)
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:kth:diva-354350 (URN) 10.1109/TRPMS.2024.3410092 (DOI) 001309978300005 () 2-s2.0-85196090996 (Scopus ID)
Note

QC 20241003

Available from: 2024-10-03 Created: 2024-10-03 Last updated: 2025-05-08. Bibliographically approved
Larsson, K., Hein, D., Huang, R., Collin, D., Andersson, J. & Persson, M. (2024). Proton stopping power ratio prediction using photon-counting computed tomography and deep learning. In: Medical Imaging 2024: Physics of Medical Imaging: . Paper presented at Medical Imaging 2024: Physics of Medical Imaging, San Diego, United States of America, Feb 19 2024 - Feb 22 2024. SPIE-Intl Soc Optical Eng, Article ID 129252P.
Proton stopping power ratio prediction using photon-counting computed tomography and deep learning
2024 (English) In: Medical Imaging 2024: Physics of Medical Imaging, SPIE-Intl Soc Optical Eng, 2024, article id 129252P. Conference paper, Published paper (Refereed)
Abstract [en]

Proton radiation therapy has the potential to achieve precise dose delivery to the tumor while sparing noncancerous surrounding tissue, owing to the sharp Bragg peaks of protons. Aligning the high-dose region with the tumor requires accurate estimates of the proton stopping power ratio (SPR) of patient tissues, commonly derived from computed tomography (CT) image data. Photon-counting detectors for CT have demonstrated advantages over their energy-integrating counterparts, such as improved quantitative imaging, higher spatial resolution, and filtering of electronic noise. In this study, the potential of photon-counting computed tomography for improving SPR estimation was assessed by training a deep neural network on a domain transform from photon-counting CT images to SPR maps. XCAT phantoms of the head were generated and used to simulate photon-counting CT images with CatSim, as well as to compute corresponding ground truth SPR maps. The CT images and SPR maps were then used as input and labels to a neural network. Prediction of SPR with the network yielded mean root mean square errors (RMSE) of 0.26% to 0.41%, an improvement on errors reported for methods based on dual-energy CT (DECT). These early results show promise for using a combination of photon-counting CT and deep learning for predicting SPR, which in turn demonstrates potential for reducing the beam range uncertainty in proton therapy.

Place, publisher, year, edition, pages
SPIE-Intl Soc Optical Eng, 2024
Keywords
photon-counting computed tomography, Proton therapy, SPR
National Category
Radiology, Nuclear Medicine and Medical Imaging
Identifiers
urn:nbn:se:kth:diva-347127 (URN) 10.1117/12.3006363 (DOI) 001223517100083 () 2-s2.0-85193548416 (Scopus ID)
Conference
Medical Imaging 2024: Physics of Medical Imaging, San Diego, United States of America, Feb 19 2024 - Feb 22 2024
Note

QC 20240610

Part of proceedings ISBN: 978-151067154-6

Available from: 2024-06-03 Created: 2024-06-03 Last updated: 2024-06-14. Bibliographically approved
Hein, D. & Persson, M. (2023). Generation of Photon-counting Spectral CT Images Using a Score-based Diffusion Model. In: Proceedings 17th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully3D): . Paper presented at The 17th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully3D), July 16-21, 2023, Stony Brook, NY, U.S.A. (pp. 155-158).
Generation of Photon-counting Spectral CT Images Using a Score-based Diffusion Model
2023 (English) In: Proceedings 17th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully3D), 2023, p. 155-158. Conference paper, Published paper (Refereed)
National Category
Medical Imaging
Identifiers
urn:nbn:se:kth:diva-341662 (URN)
Conference
The 17th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully3D), July 16-21, 2023, Stony Brook, NY, U.S.A.
Note

QC 20231229

Available from: 2023-12-28 Created: 2023-12-28 Last updated: 2025-02-09. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-7051-6625
