A Distance Covariance-based Kernel for Nonlinear Causal Clustering in Heterogeneous Populations
KTH, School of Engineering Sciences (SCI), Mathematics (Dept.), Mathematics of Data and AI; Research Group Neuroinformatics, Faculty of Computer Science, University of Vienna. ORCID iD: 0000-0002-5495-1077
Department of Computer Science and Engineering, Indian Institute of Technology Bombay.
Research Group Neuroinformatics, Faculty of Computer Science, University of Vienna; Research Platform Data Science @ Uni Vienna; Vienna Cognitive Science Hub.
2022 (English). In: Proceedings of the 1st Conference on Causal Learning and Reasoning, CLeaR 2022, ML Research Press, 2022, p. 542-558. Conference paper, Published paper (Refereed)
Abstract [en]

We consider the problem of causal structure learning in the setting of heterogeneous populations, i.e., populations in which a single causal structure does not adequately represent all population members, as is common in biological and social sciences. To this end, we introduce a distance covariance-based kernel designed specifically to measure the similarity between the underlying nonlinear causal structures of different samples. Indeed, we prove that the corresponding feature map is a statistically consistent estimator of nonlinear independence structure, rendering the kernel itself a statistical test for the hypothesis that sets of samples come from different generating causal structures. Even stronger, we prove that the kernel space is isometric to the space of causal ancestral graphs, so that distance between samples in the kernel space is guaranteed to correspond to distance between their generating causal structures. This kernel thus enables us to perform clustering to identify the homogeneous subpopulations, for which we can then learn causal structures using existing methods. Though we focus on the theoretical aspects of the kernel, we also evaluate its performance on synthetic data and demonstrate its use on a real gene expression data set.
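The workflow the abstract describes (embed each sample by its nonlinear dependence structure, compare samples through a kernel, then cluster) can be illustrated with a short sketch. The Python code below is not the paper's kernel: it uses the empirical distance covariance over all variable pairs as a crude dependence-structure feature map and puts an RBF kernel on those profiles as a simplified stand-in, then clusters samples spectrally. The function names (distance_covariance, dependence_profile, cluster_samples) and the toy data are hypothetical, and NumPy plus scikit-learn are assumed.

    import numpy as np
    from sklearn.cluster import SpectralClustering

    def distance_covariance(x, y):
        # Empirical (biased) distance covariance between two 1-D samples.
        x = np.asarray(x, dtype=float).reshape(-1, 1)
        y = np.asarray(y, dtype=float).reshape(-1, 1)
        a = np.abs(x - x.T)                                       # pairwise distances within x
        b = np.abs(y - y.T)                                       # pairwise distances within y
        A = a - a.mean(0) - a.mean(1, keepdims=True) + a.mean()   # double centering
        B = b - b.mean(0) - b.mean(1, keepdims=True) + b.mean()
        return np.sqrt(max((A * B).mean(), 0.0))

    def dependence_profile(sample):
        # Toy feature map: distance covariances of all variable pairs in one
        # sample (an n_obs x d array); a stand-in for the paper's feature map.
        d = sample.shape[1]
        return np.array([distance_covariance(sample[:, i], sample[:, j])
                         for i in range(d) for j in range(i + 1, d)])

    def cluster_samples(samples, n_clusters=2, gamma=1.0):
        # RBF kernel on dependence profiles, then spectral clustering of samples.
        profiles = np.stack([dependence_profile(s) for s in samples])
        sq_dists = ((profiles[:, None, :] - profiles[None, :, :]) ** 2).sum(-1)
        K = np.exp(-gamma * sq_dists)                             # kernel (similarity) matrix
        return SpectralClustering(n_clusters=n_clusters,
                                  affinity="precomputed").fit_predict(K)

    # Hypothetical usage: 20 samples of 3 variables from two generating structures.
    rng = np.random.default_rng(0)
    samples = []
    for k in range(20):
        x, z = rng.normal(size=200), rng.normal(size=200)
        y = x ** 2 + 0.1 * z if k < 10 else z + 0.1 * rng.normal(size=200)
        samples.append(np.column_stack([x, y, z]))
    print(cluster_samples(samples, n_clusters=2))

Samples whose variables share the same (non)linear dependence pattern should land in the same cluster, after which causal structure learning can be run per cluster, mirroring the two-stage procedure the abstract outlines.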

Place, publisher, year, edition, pages
ML Research Press, 2022. p. 542-558
Keywords [en]
clustering, distance covariance, graphical causal models, whole-graph embeddings
National Category
Probability Theory and Statistics; Signal Processing
Identifiers
URN: urn:nbn:se:kth:diva-335773
Scopus ID: 2-s2.0-85140201859
OAI: oai:DiVA.org:kth-335773
DiVA id: diva2:1795477
Conference
1st Conference on Causal Learning and Reasoning, CLeaR 2022, Eureka, United States of America, Apr 11 2022 - Apr 13 2022
Note

QC 20230908

Available from: 2023-09-08. Created: 2023-09-08. Last updated: 2023-09-08. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Scopus

Authority records

Markham, Alex
