kth.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Dynamically Automated Pruning of Universal Probabilistic Programming Languages
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.ORCID iD: 0000-0001-9703-6912
Swedish Museum of Natural History, Stockholm, Sweden.ORCID iD: 0000-0002-1513-1674
Swedish Museum of Natural History, Stockholm, Sweden.ORCID iD: 0000-0002-3929-251X
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS.ORCID iD: 0000-0001-8457-4105
(English)Manuscript (preprint) (Other academic)
Abstract [en]

Probabilistic programming languages aim to separate the user-specified model from the inference allowing users to focus on model design and leave inference to the compiler and runtime systems. However, the structure of the model affects the inference efficiency. We can utilize specific structures in the model to improve inference efficiency, such as random variables forming trees. In this work, we propose pruning, an approach that automates the forward pass of belief propagation to improve the efficiency of likelihood calculations in statically typed universal probabilistic programming languages (PPLs) by utilizing the tree structures within a model. Specifically, users annotate variables for marginalization, making the likelihood calculation more efficient. We present our method implemented in the probabilistic programming language Miking CorePPL and demonstrate the performance of our approach through a series of case studies in phylogenetics.

National Category
Computer and Information Sciences
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-353281OAI: oai:DiVA.org:kth-353281DiVA, id: diva2:1897767
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

Submitted for publication

QC 20240918

Available from: 2024-09-15 Created: 2024-09-15 Last updated: 2024-09-18Bibliographically approved
In thesis
1. Automated Optimizations for Inference in Probabilistic Programming Languages
Open this publication in new window or tab >>Automated Optimizations for Inference in Probabilistic Programming Languages
2024 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Probabilistic programming languages (PPLs) provide users with an interface to write probabilistic models and leave the statistical inference to the compiler and runtime systems. Ideally, the purpose of PPLs is to provide a seamless inference interface that makes the model and the inference parts modular. Yet, the structure of the user-defined model affects inference accuracy and execution time. Users can utilize certain structures, such as conjugate prior relations or tree structures within a given model, to improve inference efficiency; however, manually utilizing these structures in a model is time-consuming and error-prone. Especially as the PPL's expressiveness increases, the models may become too complex to rewrite. This thesis tackles the problems related to the automation of optimizations, improving the execution time and the accuracy of the inference based on the structure of the model. Further, it aims to make these optimizations expressive enough to handle complex models written in universal PPLs. 

The scope of this thesis is to optimize Bayesian inference in universal PPLs. The three contributions presented propose automated optimization approaches to improving the inference accuracy and execution time while preserving the expressiveness of the optimized program. The first contribution considers dynamic delayed sampling, a runtime algorithm to reduce inference variance through analytical relationships between random variables. We develop a compile-time version of the delayed sampling algorithm to reduce execution overhead. The second contribution regards the design and implementation of dynamic delayed sampling in a statically typed universal PPL, Miking CorePPL. Integrating delayed sampling into a statically typed PPL requires introducing additional constructs to the language. The third contribution is related to the forward pass of belief propagation algorithm, an efficient inference algorithm on tree graphs. We propose an approach automating this forward pass of belief propagation in statically typed universal PPLs to optimize likelihood calculations by utilizing the tree structures in a model. We evaluate the proposed approaches on real-world problems, such as topic modeling and phylogenetic problems.

Abstract [sv]

Probabilistiska programmeringsspråk (PPS) ger användare ett gränssnitt för att skriva probabilistiska modeller och överlåta den statistiska inferensen till kompilatorn och runtime-systemen. Idealiskt sett är syftet med PPS att erbjuda ett sömlöst inferencesgränssnitt som gör modell- och inferensdelarna modulära. Dock påverkas inferensens noggrannhet och exekveringstid av strukturen hos den användardefinierade modellen. Användare kan utnyttja vissa strukturer, såsom konjugerade prior-relationer eller trädsstrukturer inom en given modell, för att förbättra inferensens effektivitet; men att manuellt använda dessa strukturer i en modell är tidskrävande och felbenäget. Speciellt när PPL uttrycksförmåga ökar, kan modellerna bli för komplexa för att skrivas om på detta sätt. Denna avhandling tar itu med problemen relaterade till automatisering av optimeringar, förbättring av exekveringstid och noggrannheten hos inferens baserat på modellens struktur. Vidare fokuseras hur dessa optimeringar kan göras tillräckligt uttrycksfulla för att hantera komplexa modeller skrivna i universella PPS.

Omfattningen av denna avhandling är att optimera Bayesiansk inferens i universella PPS. De tre bidragen som presenteras föreslår automatiserade optimeringsmetoder för att förbättra inferensens noggrannhet och exekveringstid samtidigt som uttrycksfullheten hos det optimerade programmet bevaras. Det första bidraget behandlar dynamisk fördröjd sampling, en runtime-algoritm för att minska inferensens varians genom att använda analytiska relationer mellan slumpmässiga variabler. Vi utvecklar en kompileringstid-version av den fördröjda samplingalgoritmen för att minska exekveringsomkostnad. Det andra bidraget gäller design och implementering av dynamisk fördröjd sampling i Miking CorePPL, som är ett statiskt typat universellt PPS. Att integrera fördröjd sampling i ett statiskt typat PPS kräver införande av ytterligare konstruktioner i språket. Det tredje bidraget är relaterat till framåtpasset av belief propagation-algoritmen, en effektiv inferensalgoritm på trädgrafer. Vi föreslår en metod för att automatisera detta framåtpass av belief propagation i statiskt typade universella PPS för att optimera sannolikhetsberäkningar genom att utnyttja trädsstrukturer i en modell. Vi utvärderar de föreslagna metoderna på verkliga problem, såsom ämnesmodellering och fylogenetiska problem.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2024. p. vii, 29
Series
TRITA-EECS-AVL ; 2024:73
Keywords
PPL, Bayesian, Inference, Efficiency, PPS, Bayesian, Inferensen, Effektiv
National Category
Computer and Information Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-353310 (URN)978-91-8106-061-4 (ISBN)
Presentation
2024-10-14, https://kth-se.zoom.us/j/66846620144, Room Ada, Electrum, Kistagangen 16, Stockholm, 13:00 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP)
Note

QC 20240918

Available from: 2024-09-18 Created: 2024-09-16 Last updated: 2024-09-23Bibliographically approved

Open Access in DiVA

fulltext(4533 kB)74 downloads
File information
File name FULLTEXT01.pdfFile size 4533 kBChecksum SHA-512
7424232132576cacd6b5ac504c8c8018ba23dfadffcebf5a96a88b6b9a8fcabf02cba605a8fb35991757ac2d3f2a98fbd50da7e30e6ce9eac65e919a39e1a006
Type fulltextMimetype application/pdf

Authority records

Çaylak, GizemBroman, David

Search in DiVA

By author/editor
Çaylak, GizemGranqvist, EmmaRonquist, FredrikBroman, David
By organisation
Software and Computer systems, SCS
Computer and Information Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 74 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 153 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf