Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0002-6820-948X
OATML, University of Oxford. ORCID iD: 0000-0001-9944-1129
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control); KTH, School of Electrical Engineering and Computer Science (EECS), Centres, Digital futures. ORCID iD: 0000-0001-9940-5929
OATML, University of Oxford. ORCID iD: 0000-0002-2733-2078
2024 (English). In: International Conference on Machine Learning, ICML 2024, ML Research Press, 2024, p. 23215-23237. Conference paper, Published paper (Refereed)
Abstract [en]

Representing uncertainty in causal discovery is a crucial component of experimental design and, more broadly, of safe and reliable causal decision making. Bayesian Causal Discovery (BCD) offers a principled approach to encapsulating this uncertainty. Unlike non-Bayesian causal discovery, which is assessed through a single estimated causal graph and set of model parameters, evaluating BCD is challenging because of the nature of its inferred quantity: the posterior distribution. As a result, the research community has proposed various metrics for assessing the quality of the approximate posterior. However, there is, to date, no consensus on the most suitable metric(s) for evaluation. In this work, we reexamine this question by dissecting various metrics and understanding their limitations. Through extensive empirical evaluation, we find that many existing metrics fail to exhibit a strong correlation with the quality of the approximation to the true posterior, especially in scenarios with low sample sizes, where BCD is most desirable. We highlight the suitability (or lack thereof) of these metrics under two distinct factors: the identifiability of the underlying causal model and the quantity of available data. Both factors affect the entropy of the true posterior, indicating that the current metrics are less fitting in settings of higher entropy. Our findings underline the importance of evaluating new methods with the nature of the true posterior in mind, and they guide and motivate the development of new evaluation procedures for this challenge.
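To make the quantities the abstract discusses concrete, the following is a minimal, hedged sketch (illustrative only, not code from the paper): it computes the Shannon entropy of a discrete posterior over a tiny space of two-variable graphs and the expected structural Hamming distance (E-SHD), one of the commonly used BCD metrics, under a high-entropy and a low-entropy posterior. The graphs, probabilities, and helper names are invented for illustration.

```python
import math

# Toy setup (hypothetical): graphs are frozensets of directed edges
# (i, j) meaning i -> j, and a posterior is a dict {graph: probability}.

def entropy(posterior):
    """Shannon entropy (in nats) of a discrete posterior over graphs."""
    return -sum(p * math.log(p) for p in posterior.values() if p > 0)

def shd(g1, g2):
    """Structural Hamming distance: edge differences between two graphs,
    counting a reversed edge as a single difference."""
    diff = g1 ^ g2  # symmetric difference of the edge sets
    seen, d = set(), 0
    for (i, j) in diff:
        if (j, i) in seen:      # second half of a reversal: already counted
            continue
        seen.add((i, j))
        d += 1
    return d

def expected_shd(posterior, true_graph):
    """E-SHD: posterior-weighted average distance to the true graph."""
    return sum(p * shd(g, true_graph) for g, p in posterior.items())

# Two-node example with true graph X -> Y.
g_xy, g_yx, g_empty = frozenset({(0, 1)}), frozenset({(1, 0)}), frozenset()

# High-entropy posterior (e.g. non-identifiable model or little data)
flat = {g_xy: 0.4, g_yx: 0.4, g_empty: 0.2}
# Low-entropy posterior concentrated on the truth
peaked = {g_xy: 0.9, g_yx: 0.05, g_empty: 0.05}

print(entropy(flat), expected_shd(flat, g_xy))      # ~1.055, 0.6
print(entropy(peaked), expected_shd(peaked, g_xy))  # ~0.394, 0.1
```

In the high-entropy regime the flat posterior may be the correct Bayesian answer, yet its E-SHD is worse than the peaked posterior's; this is the kind of mismatch between metric value and posterior quality that the paper examines.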

Place, publisher, year, edition, pages
ML Research Press, 2024, p. 23215-23237
National Category
Biological Sciences
Identifiers
URN: urn:nbn:se:kth:diva-353949
Scopus ID: 2-s2.0-85203804991
OAI: oai:DiVA.org:kth-353949
DiVA, id: diva2:1901025
Conference
41st International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024
Note

QC 20250922

Available from: 2024-09-25. Created: 2024-09-25. Last updated: 2025-09-22. Bibliographically approved
In thesis
1. Bayesian Causal Discovery and Object-Centric Representations: Challenges and Insights in Structured Learning
2025 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Causality and Representation Learning are foundational to advancing AI systems capable of reasoning, generalizing, and understanding the complex structure of the world. Causality provides tools to uncover the underlying causal structure of a system, understand cause-effect relationships, and reason about interventions. Representation Learning, on the other hand, transforms raw data into structured abstractions essential for modeling the underlying system and decision-making. Causal Representation Learning bridges these paradigms by using representation learning to extract high-level abstractions and entities and integrating causal reasoning principles to uncover cause-effect relationships between these entities. This approach is crucial for real-world systems, where causal relationships are typically defined between high-level entities, such as objects or interactions, rather than low-level sensory inputs like pixels. This thesis explores two key paradigms presented as a collection of two papers: the challenges in the evaluation of Bayesian Causal Discovery, and the effectiveness of structured representations, with a focus on object-centric representations in visual reasoning.

In the first paper, we study the challenges in the evaluation of Bayesian Causal Discovery methods. By analyzing existing metrics on linear additive noise models, we find that current metrics often fail to correlate with the true posterior in high-entropy settings, such as with limited data or non-identifiable causal models. We highlight the importance of considering posterior entropy and recommend evaluating Bayesian Causal Discovery methods on downstream tasks, such as causal effect estimation, for more meaningful evaluation in such scenarios.
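The paragraph above recommends evaluating BCD methods on downstream tasks such as causal effect estimation. A minimal, hedged sketch of that idea (purely illustrative; the model, numbers, and function names are invented, not the paper's setup): under a toy linear model where the effect of do(X = 1) on Y is b if the graph contains X -> Y and 0 otherwise, an approximate posterior can be scored by how far its posterior-averaged causal effect lies from the true effect.

```python
def ace(graph_has_x_to_y, b=1.0):
    """Average causal effect of do(X = 1) on Y implied by one graph,
    under a toy linear model with coefficient b on the edge X -> Y."""
    return b if graph_has_x_to_y else 0.0

def posterior_ace(posterior):
    """Posterior-averaged causal effect: E_{G ~ p(G|D)}[ACE(G)].
    The posterior is keyed by whether the graph contains X -> Y."""
    return sum(p * ace(has_edge) for has_edge, p in posterior.items())

true_effect = ace(True)               # ground truth: X -> Y, effect 1.0
approx = {True: 0.7, False: 0.3}      # some approximate posterior
downstream_error = abs(posterior_ace(approx) - true_effect)
print(downstream_error)               # 0.3
```

The appeal of such a downstream score is that it rewards posteriors whose predictions about interventions are accurate, even when the posterior is (correctly) spread over several graphs.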

In the second paper, we investigate the effectiveness of object-centric representations in visual reasoning tasks, such as Visual Question Answering. We show that while large foundation models often match or surpass object-centric models in performance, they require larger downstream models and more compute because their representations are less explicit. In contrast, object-centric models provide more interpretable representations but struggle on more complex datasets. Combining object-centric representations with foundation models emerges as a promising solution, reducing computational costs while maintaining high performance. We also provide several further insights, such as the relationship between segmentation performance and downstream performance and the effect of factors such as dataset size and question type, to deepen our understanding of these models.


Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2025. p. vii, 53
Series
TRITA-EECS-AVL ; 2025:19
Keywords
Causality, Bayesian Causal Discovery, Representation Learning, Object-Centric Learning, Kausalitet, Bayesiansk Kausal Upptäckt, Representationslärande, Objektcentriskt Lärande
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:kth:diva-359733 (URN)
978-91-8106-191-8 (ISBN)
Presentation
2025-03-07, https://kth-se.zoom.us/j/68284213723, E3, Rum 1563, Osquars backe 18, KTH Campus, Stockholm, 10:00 (English)
Opponent
Supervisors
Funder
Wallenberg AI, Autonomous Systems and Software Program (WASP), 30007
Note

QC 20250212

Available from: 2025-02-12. Created: 2025-02-10. Last updated: 2025-11-04. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Scopus

Authority records

Mamaghan, Amir Mohammad Karimi; Johansson, Karl H.; Bauer, Stefan

Search in DiVA

By author/editor
Mamaghan, Amir Mohammad Karimi; Tigas, Panagiotis; Johansson, Karl H.; Gal, Yarin; Bauer, Stefan
By organisation
Decision and Control Systems (Automatic Control); Digital futures
Biological Sciences
