kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
High Performance Computing for the Optimization of Radiation Therapy Treatment Plans
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Computational Science and Technology (CST). RaySearch Laboratories.ORCID iD: 0000-0001-6865-9379
2024 (English)Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Radiation therapy is a clinical field in which computer simulations play a crucial role. Before patients undergo radiation therapy, an individual treatment plan for each patient needs to be created based on the specifics of their case (a process often referred to as treatment planning). The main aspect of this is setting the control parameters for the treatment machine so that the radioactive dose delivered to the patient is as concentrated to the tumor volume as possible. The inverse problem of determining such appropriate control parameters is typically formulated as an optimization problem, which, considering the complexity of modern treatment machines, requires computerized algorithms to solve accurately. Solving this optimization problem can be a key computational bottleneck in the treatment planning workflow. In many cases, finding a suitable treatment plan is a trial-and-error process, requiring multiple solutions of the optimization problems with different weights and parameters.

This thesis proposes different methods to enable the use of high performance computing (HPC) hardware to accelerate the optimization process in radiation therapy. We deal with two main computational aspects of the optimization workflow, the calculation of dose, gradients and objective functions; as well as the optimization solver itself. For dose calculation during optimization, we propose a CUDA kernel for sparse matrix-vector products tuned to dose deposition matrices from proton therapy. For the evaluation of the objective function -- which is often constructed as a weighted sum -- we develop a method to distribute the calculation of the objective function and its gradient across computational nodes using message passing.

For the optimization solver itself we first propose a task-based parallel implementation for Cholesky factorization of banded matrices. This can be an important kernel in interior point methods (IPM) for optimization, depending on the structure of the optimization constraints. The final two papers in this thesis deal with the adoption of iterative linear algebra in IPMs to enable GPU acceleration of the optimization solve. We develop an IPM using a doubly augmented formulation of the KKT linear system and Jacobi preconditioned conjugate gradient method, which we show can solve problems to acceptable accuracy as well as benefit substantially from GPU acceleration. Benchmarks against the commercial treatment planning system RayStation shows that our solver can improve performance by up to 4.4 times on some realistic cases.

Abstract [sv]

Strålterapi är en klinisk gren där datorsimuleringar spelar en viktig roll. Innan patienters strålbehandlingar påbörjas behövs individuella behandlingsplaner skapas baserat på varje patientfalls specifika omständigheter (en process som ofta kallas dosplanering (eng. treament planning)). Huvudaspekten av denna process är att bestämma styrparametrar för behandlingsmaskinen så att stråldosen som levereras till patienten är så koncentrerad till tumören som möjligt. Det inversa problemet att bestämma sådana styrparametrar formuleras typiskt som ett optimeringsproblem, vilket, givet hur komplexa moderna behandlingsmaskiner är, kräver datoralgoritmer för att lösas precist. Att lösa detta optimeringsproblem kan vara en viktig flaskhals i dosplaneringsprocessen. I många fall är det en iterativ process att hitta en lämplig behandlingsplan som kräver att man testar och löser optimeringsproblemet med flera olika vikter och parametrar.

I denna avhandling studerar vi metoder för att möjliggöra användandet av hårdvara från högprestandaberäkningar för att accelerera optimeringsprocessen i strålterapi. Vi angriper två olika beräkningsaspekter i lösningen av optimeringsproblemet: beräkningen av stråldos, gradienter och målfunktioner, samt lösningen av optimeringsproblemet i sig. För dosberäkningen som krävs under optimeringen utvecklar vi en beräkningsmetod i CUDA för matris-vektor produkter med glesa matriser som är specifikt anpassad för dosmatriser (eng. dose deposition matrices) från protonterapi. För beräkningen av målfunktionen --- som i många fall är formulerad som en viktad summa --- utvecklar vi en metod för att distribuera beräkningen av målfunktionen och dess gradient över flera beräkningsnoder genom message passing.

För optimeringslösaren själv så utvecklar vi först en parallel implementation av Cholesky-faktorisering för bandade matriser med parallelisering baserad på tasks. Just den beräkningsalgoritmen kan vara viktig för många inrepunktsmetoder för optimering, baserat på strukturen av bivillkoren i optimeringsproblemet. De sista två publikationerna i avhandlingen studerar användandet av iterativa linjära lösare i inrepunktsmetoder för att möjliggöra GPU-accelerering av av optimeringslösaren. Vi utvecklar en inrepunksmetod med en dubbelaugmenterad formulering av KKT-systemet och en Jacobi-förkonditionerad CG-lösare (eng. conjugate gradient method). Vi demonstrerar att våran metod kan lösa optimeringsproblemen till tillräcklig noggranhet och dessutom ha stor nytta av GPU-acceleration. Prestandajämförelser med det kommersiella dosplaneringssystemet RayStation visar att våran lösare kan förbättra lösningstiden med upp till 4.4 gånger på några realistiska patientfall.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. , p. 73
Series
TRITA-EECS-AVL ; 2024:35
Keywords [en]
HPC, Radiation Therapy, Optimization, Numerical Linear Algebra, Interior Point Methods
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-345836ISBN: 978-91-8040-907-0 (print)OAI: oai:DiVA.org:kth-345836DiVA, id: diva2:1853280
Public defence
2024-05-22, https://kth-se.zoom.us/j/69603435226, F3, Lindstedtsvägen 26, Stockholm, 14:00 (English)
Opponent
Supervisors
Note

QC 20140423

Available from: 2024-04-23 Created: 2024-04-22 Last updated: 2024-04-29Bibliographically approved
List of papers
1. Accelerating Radiation Therapy Dose Calculation with Nvidia GPUs
Open this publication in new window or tab >>Accelerating Radiation Therapy Dose Calculation with Nvidia GPUs
Show others...
2021 (English)In: IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Institute of Electrical and Electronics Engineers (IEEE) , 2021Conference paper, Published paper (Refereed)
Abstract [en]

Radiation Treatment Planning (RTP) is the process of planning the appropriate external beam radiotherapy to combat cancer in human patients. RTP is a complex and compute-intensive task, which often takes a long time (several hours) to compute. Reducing this time allows for higher productivity at clinics and more sophisticated treatment planning, which can materialize in better treatments. The state-of-the-art in medical facilities uses general-purpose processors (CPUs) to perform many steps in the RTP process. In this paper, we explore the use of accelerators to reduce RTP calculating time. We focus on the step that calculates the dose using the Graphics Processing Unit (GPU), which we believe is an excellent candidate for this computation type. Next, we create a highly optimized implementation for a custom Sparse Matrix-Vector Multiplication (SpMV) that operates on numerical formats unavailable in state-of-the-art SpMV libraries (e.g., Ginkgo and cuSPARSE). We show that our implementation is several times faster than the baseline (up-to 4x) and has a higher operational intensity than similar (but different) versions such as Ginkgo and cuSPARSE.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2021
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-296824 (URN)10.1109/IPDPSW52791.2021.00076 (DOI)000689576200059 ()2-s2.0-85114391525 (Scopus ID)
Conference
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 17-21 June 2021, Portland, OR, USA
Note

Part of proceedings: ISBN 978-1-6654-3577-2

QC 20230117

Available from: 2021-06-10 Created: 2021-06-10 Last updated: 2024-04-22Bibliographically approved
2. Distributed Objective Function Evaluation for Optimization of Radiation Therapy Treatment Plans
Open this publication in new window or tab >>Distributed Objective Function Evaluation for Optimization of Radiation Therapy Treatment Plans
2023 (English)In: PPAM 2022. Lecture Notes in Computer Science, vol 13826., Springer Nature , 2023Conference paper, Published paper (Refereed)
Abstract [en]

The modern workflow for radiation therapy treatment planning involves mathematical optimization to determine optimal treatment machine parameters for each patient case. The optimization problems can be computationally expensive, requiring iterative optimization algorithms to solve. In this work, we investigate a method for distributing the calculation of objective functions and gradients for radiation therapy optimization problems across computational nodes. We test our approach on the TROTS dataset--- which consists of optimization problems from real clinical patient cases---using the IPOPT optimization solver in a leader/follower type approach for parallelization. We show that our approach can utilize multiple computational nodes efficiently, with a speedup of approximately 2-3.5 times compared to the serial version.

Place, publisher, year, edition, pages
Springer Nature, 2023
National Category
Engineering and Technology
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-327020 (URN)10.1007/978-3-031-30442-2_29 (DOI)2-s2.0-85161412381 (Scopus ID)
Conference
PPAM 2022: Parallel Processing and Applied Mathematics
Note

QC 20230523

Available from: 2023-05-17 Created: 2023-05-17 Last updated: 2024-08-28Bibliographically approved
3. A survey of HPC algorithms and frameworks for large-scale gradient-based nonlinear optimization
Open this publication in new window or tab >>A survey of HPC algorithms and frameworks for large-scale gradient-based nonlinear optimization
2022 (English)In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 78, no 16, p. 17513-17542Article in journal (Refereed) Published
Abstract [en]

Large-scale numerical optimization problems arise from many fields and have applications in both industrial and academic contexts. Finding solutions to such optimization problems efficiently requires algorithms that are able to leverage the increasing parallelism available in modern computing hardware. In this paper, we review previous work on parallelizing algorithms for nonlinear optimization. To introduce the topic, the paper starts by giving an accessible introduction to nonlinear optimization and high-performance computing. This is followed by a survey of previous work on parallelization and utilization of high-performance computing hardware for nonlinear optimization algorithms. Finally, we present a number of optimization software libraries and how they are able to utilize parallel computing today. This study can serve as an introduction point for researchers interested in nonlinear optimization or high-performance computing, as well as provide ideas and inspiration for future work combining these topics. 

Place, publisher, year, edition, pages
Springer Nature, 2022
Keywords
HPC, Interior-point method, Nonlinear optimization, Parallel computing, Nonlinear programming, Computing hardware, Gradient based, Large-scales, Non-linear optimization, Numerical optimizations, Optimization problems, Parallel com- puting, Performance computing, Surveys
National Category
Control Engineering
Identifiers
urn:nbn:se:kth:diva-323843 (URN)10.1007/s11227-022-04555-8 (DOI)000799541400003 ()2-s2.0-85130240510 (Scopus ID)
Note

QC 20230220

Available from: 2023-02-20 Created: 2023-02-20 Last updated: 2024-04-22Bibliographically approved
4. Parallel Cholesky Factorization for Banded Matrices Using OpenMP Tasks
Open this publication in new window or tab >>Parallel Cholesky Factorization for Banded Matrices Using OpenMP Tasks
2023 (English)In: Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Proceedings, Springer Nature , 2023, p. 725-739Conference paper, Published paper (Refereed)
Abstract [en]

Cholesky factorization is a method for solving linear systems involving symmetric, positive-definite matrices, and can be an attractive choice in applications where a high degree of numerical stability is needed. One such application is mathematical optimization, where direct methods for solving linear systems are widely used and often a significant performance bottleneck. An example where this is the case, and the specific type of optimization problem motivating this work, is radiation therapy treatment planning, where mathematical optimization is used to create individual treatment plans for patients. To address this bottleneck, we propose a task-based multi-threaded method for Cholesky factorization of banded matrices with medium-sized bands. We implement our algorithm using OpenMP tasks and compare our performance with state-of-the-art libraries such as Intel MKL. Our performance measurements show a performance that is on par or better than Intel MKL (up to ∼ 26 % on a single CPU socket) for a wide range of matrix bandwidths on two different Intel CPU systems.

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Cholesky factorization, Linear Solver, Task-Based Parallelism
National Category
Computer Systems Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-337877 (URN)10.1007/978-3-031-39698-4_49 (DOI)2-s2.0-85171536146 (Scopus ID)
Conference
29th International European Conference on Parallel and Distributed Computing, Euro-Par 2023, Limassol, Cyprus, Aug 28 2023 - Sep 1 2023
Note

Part of ISBN 9783031396977

QC 20231010

Available from: 2023-10-10 Created: 2023-10-10 Last updated: 2024-04-22Bibliographically approved
5. Krylov Solvers for Interior Point Methods with Applications in Radiation Therapy and Support Vector Machines
Open this publication in new window or tab >>Krylov Solvers for Interior Point Methods with Applications in Radiation Therapy and Support Vector Machines
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Interior point methods are widely used for different types of mathematical optimization problems. Many implementations of interior point methods in use today rely on direct linear solvers to solve systems of equations in each iteration. The need to solve ever larger optimization problems more efficiently and the rise of hardware accelerators for general purpose computing has led to a large interest in using iterative linear solvers instead, with the major issue being inevitable ill-conditioning of the linear systems arising as the optimization progresses. We investigate the use of Krylov solvers for interior point methods in solving optimization problems from radiation therapy and support vector machines. We implement a prototype interior point method using a so called doubly augmented formulation of the Karush-Kuhn-Tucker linear system of equations, originally proposed by Forsgren and Gill, and evaluate its performance on real optimization problems from radiation therapy and support vector machines. Crucially, our implementation uses a preconditioned conjugate gradient method with Jacobi preconditioning internally. Our measurements of the conditioning of the linear systems indicate that the Jacobi preconditioner improves the conditioning of the systems to a degree that they can be solved iteratively, but there is room for further improvement in that regard. Furthermore, profiling of our prototype code shows that it is suitable for GPU acceleration, which may further improve its performance in practice. Overall, our results indicate that our method can find solutions of acceptable accuracy in reasonable time, even with a simple Jacobi preconditioner.

National Category
Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-345630 (URN)
Note

QC 20240416

Available from: 2024-04-15 Created: 2024-04-15 Last updated: 2024-04-22Bibliographically approved
6. A GPU Accelerated Interior Point Method for Radiation Therapy Optimization
Open this publication in new window or tab >>A GPU Accelerated Interior Point Method for Radiation Therapy Optimization
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Optimization plays a central role in modern radiation therapy, where it is used to determine optimal treatment machine parameters in order to deliver precise doses adapted to each patient case. In general, solving the optimization problems that arise can present a computational bottleneck in the treatment planning process, as they can be large in terms of both variables and constraints. In this paper, we develop a GPU accelerated optimization solver for radiation therapy applications, based on an interior point method (IPM) utilizing iterative linear algebra to find search directions. The use of iterative linear algebra makes the solver suitable for porting to GPUs, as the core computational kernels become standard matrix-vector or vector-vector operations. Our solver is implemented in C++20 and uses CUDA for GPU acceleration.

The problems we solve are from the commercial treatment planning system RayStation, developed by RaySearch Laboratories (Stockholm, Sweden), which is used clinically in hundreds of cancer clinics around the world. RayStation solves (in general) nonlinear optimization problems using a sequential quadratic programming (SQP) method, where the main computation lies in solving quadratic programming (QP) sub-problems in each iteration. GPU acceleration for the solution of such QP sub-problems is the focus of the interior point method of this work. We benchmark our solver against the existing QP-solver in RayStation and show that our GPU accelerated IPM can accelerate the aggregated time-to-solution for all QP sub-problems in one SQP solve by 1.4 and 4.4 times respectively for two real patient cases.

National Category
Computational Mathematics
Identifiers
urn:nbn:se:kth:diva-345631 (URN)
Note

QC 20240416

Available from: 2024-04-15 Created: 2024-04-15 Last updated: 2024-04-22Bibliographically approved

Open Access in DiVA

kappa(3298 kB)495 downloads
File information
File name FULLTEXT01.pdfFile size 3298 kBChecksum SHA-512
46d126f409d55a1e171409406c64e39a7c4e0d3cfd83a3b9f7a26fc6f67fe4d7d075c6a3ebe011a4e4b0583921819681b865ecea8566c2353806ee4cd29b1e3f
Type fulltextMimetype application/pdf

Authority records

Liu, Felix

Search in DiVA

By author/editor
Liu, Felix
By organisation
Computational Science and Technology (CST)
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 495 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

isbn
urn-nbn

Altmetric score

isbn
urn-nbn
Total: 1712 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf