Publications (10 of 59)
Li, T., Yang, Q., Acs, B., Sifakis, E. G., Toosi, H., Engblom, C., . . . Hartman, J. (2025). Computational pathology annotation enhances the resolution and interpretation of breast cancer spatial transcriptomics data. npj Precision Oncology, 9(1), Article ID 310.
2025 (English). In: npj Precision Oncology, E-ISSN 2397-768X, Vol. 9, no 1, article id 310. Article in journal (Refereed), Published.
Abstract [en]

Breast cancer is a highly heterogeneous disease with diverse outcomes, and intra-tumoral heterogeneity plays a significant role in both diagnosis and treatment. Despite its importance, the spatial distribution of intra-tumoral heterogeneity is not fully elucidated. Spatial transcriptomics has emerged as a promising tool to study the molecular mechanisms behind many diseases. It offers accurate measurements of RNA abundance, providing powerful tools to correlate the morphologies of cellular neighborhoods with localized gene expression patterns. However, the spot-based spatial transcriptomic tools, including the most widely used platform, Visium, do not achieve single-cell resolution readouts, which hinders data interpretability. In this study, we present a computational pathology image analysis pipeline (i.e., computational tissue annotation, CTA) that utilizes machine learning algorithms to accurately map tumor, stroma, and immune compartments within Visium-assayed tumor sections. Using a cohort of 23 breast tumor sections from four patients, we demonstrate that CTA can provide high-resolution annotations on the hematoxylin-and-eosin-stained images alongside the paired sequencing data, support the evaluation of deconvolution methods, deepen insights into intra-tumoral heterogeneity by increasing data analysis resolution, assist with spatially resolved intrinsic subtyping, and enhance the visualization of lymphocyte clones at single-cell resolution. The proposed pipeline provides valuable insights into the complex spatial architecture of breast cancer, contributing to more personalized diagnostics and treatment strategies.

Place, publisher, year, edition, pages
Springer Nature, 2025
National Category
Cancer and Oncology Cell and Molecular Biology
Identifiers
urn:nbn:se:kth:diva-370404 (URN); 10.1038/s41698-025-01104-3 (DOI); 001566906400001 (); 40925915 (PubMedID); 2-s2.0-105015372811 (Scopus ID)
Note

QC 20250926

Available from: 2025-09-26. Created: 2025-09-26. Last updated: 2025-09-26. Bibliographically approved.
Zampinetti, V., Melin, H. & Lagergren, J. (2025). Sampling random spanning arborescences in graphs with low conductance. Statistics and Probability Letters, 226, Article ID 110481.
2025 (English). In: Statistics and Probability Letters, ISSN 0167-7152, E-ISSN 1879-2103, Vol. 226, article id 110481. Article in journal (Refereed), Published.
Abstract [en]

Sampling random spanning arborescences in directed graphs is critical for applications in network analysis, optimization, and machine learning. While many state-of-the-art methods perform well on graphs with high conductance, they often fail or generalize poorly on low-conductance graphs. Inspired by Wilson's algorithm, we propose a novel sampling approach that overcomes this limitation by using dynamic programming to compute random walk probabilities. This avoids both inefficient walk simulations and numerically unstable Laplacian determinant calculations. Our method demonstrates superior efficiency and sampling quality in simulations, and is the only one to handle low-conductance graphs effectively.
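
For orientation, the classical Wilson's algorithm that the abstract cites as inspiration can be sketched as below. This is a minimal illustration of the textbook loop-erased random walk, not the dynamic-programming method proposed in the paper; the function name, the graph encoding, and the example graph are assumptions made for this sketch. The explicit walk simulation in the first inner loop is the step the abstract describes as inefficient on low-conductance graphs.

```python
import random


def wilson_arborescence(out_edges, root, rng=None):
    """Sample a spanning arborescence oriented towards `root`.

    out_edges: dict mapping every vertex to a list of (neighbor, weight) pairs.
    Returns a dict `parent` where parent[v] is the next vertex on the path
    from v to the root. Assumes the root is reachable from every vertex.
    """
    rng = rng or random.Random()
    in_tree = {root}
    parent = {}
    for start in out_edges:
        # Random walk from `start` until it hits the current tree.
        # Overwriting parent[v] on every revisit erases loops implicitly.
        v = start
        while v not in in_tree:
            neighbors, weights = zip(*out_edges[v])
            parent[v] = rng.choices(neighbors, weights=weights)[0]
            v = parent[v]
        # Retrace the loop-erased path and attach it to the tree.
        v = start
        while v not in in_tree:
            in_tree.add(v)
            v = parent[v]
    return parent


# Hypothetical 4-vertex directed graph, used only for illustration.
graph = {
    "a": [("b", 1.0), ("c", 1.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("a", 1.0), ("d", 1.0)],
    "d": [("c", 1.0)],
}
parent = wilson_arborescence(graph, root="a", rng=random.Random(0))
```

Each non-root vertex ends up with exactly one outgoing tree edge, so following `parent` from any vertex leads to the root.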

Place, publisher, year, edition, pages
Elsevier BV, 2025
Keywords
Bayesian inference, Random tree sampling, Random walk, Wilson's algorithm
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-368753 (URN); 10.1016/j.spl.2025.110481 (DOI); 001521395300001 (); 2-s2.0-105008892340 (Scopus ID)
Note

QC 20250821

Available from: 2025-08-21. Created: 2025-08-21. Last updated: 2025-10-03. Bibliographically approved.
Hotti, A., Van der Goten, L. & Lagergren, J. (2024). Benefits of Non-Linear Scale Parameterizations in Black Box Variational Inference through Smoothness Results and Gradient Variance Bounds. In: Proceedings of the 27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024. Paper presented at 27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024, Valencia, Spain, May 2 2024 - May 4 2024 (pp. 3538-3546). ML Research Press, 238
2024 (English). In: Proceedings of the 27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024, ML Research Press, 2024, Vol. 238, p. 3538-3546. Conference paper, Published paper (Refereed).
Abstract [en]

Black box variational inference has consistently produced impressive empirical results. Convergence guarantees require that the variational objective exhibits specific structural properties and that the noise of the gradient estimator can be controlled. In this work we study the smoothness and the variance of the gradient estimator for location-scale variational families with non-linear covariance parameterizations. Specifically, we derive novel theoretical results for the popular exponential covariance parameterization and tighter gradient variance bounds for the softplus parameterization. These results reveal the benefits of using non-linear scale parameterizations on large-scale datasets. With a non-linear scale parameterization, the smoothness constant of the variational objective and the upper bound on the gradient variance decrease as the scale parameter becomes smaller. Learning posterior approximations with small scales is essential in Bayesian statistics with a sufficient amount of data, since under appropriate assumptions, the posterior distribution is known to contract around the parameter of interest as the sample size increases. We validate our theoretical findings through empirical analysis on several large-scale datasets, underscoring the importance of non-linear parameterizations.
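
To make the two non-linear scale parameterizations named in the abstract concrete, here is a minimal one-dimensional sketch. It assumes a scalar location-scale family with a reparameterized sample z = mu + sigma * eps; the function names are illustrative and not taken from the paper, which treats covariance parameterizations more generally.

```python
import math


def exp_scale(phi):
    """Exponential parameterization: sigma = exp(phi)."""
    return math.exp(phi)


def softplus_scale(phi):
    """Softplus parameterization: sigma = log(1 + exp(phi)).

    Written in a numerically stable form that avoids overflow for large phi.
    """
    return max(phi, 0.0) + math.log1p(math.exp(-abs(phi)))


def reparameterized_sample(mu, phi, eps, scale_fn):
    """Location-scale sample z = mu + sigma * eps, with sigma = scale_fn(phi)."""
    return mu + scale_fn(phi) * eps


# Both maps send any real-valued phi to a strictly positive scale.
for phi in (-5.0, 0.0, 5.0):
    print(phi, exp_scale(phi), softplus_scale(phi))
```

Because both maps are positive for every real `phi`, the optimizer can work on an unconstrained parameter while the scale itself may shrink towards zero.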

Place, publisher, year, edition, pages
ML Research Press, 2024
Series
Proceedings of Machine Learning Research, ISSN 2640-3498 ; 238
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:kth:diva-347320 (URN); 001286500303007 (); 2-s2.0-85194189884 (Scopus ID)
Conference
27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024, Valencia, Spain, May 2 2024 - May 4 2024
Note

QC 20241213

Available from: 2024-06-10. Created: 2024-06-10. Last updated: 2025-05-20. Bibliographically approved.
Mold, J. E., Weissman, M. H., Ratz, M., Hagemann-Jensen, M., Hård, J., Eriksson, C. J., . . . Frisén, J. (2024). Clonally heritable gene expression imparts a layer of diversity within cell types. Cell systems, 15(2), 149
2024 (English). In: Cell systems, E-ISSN 2405-4720, Vol. 15, no 2, p. 149-. Article in journal (Refereed), Published.
Abstract [en]

Cell types can be classified according to shared patterns of transcription. Non-genetic variability among individual cells of the same type has been ascribed to stochastic transcriptional bursting and transient cell states. Using high-coverage single-cell RNA profiling, we asked whether long-term, heritable differences in gene expression can impart diversity within cells of the same type. Studying clonal human lymphocytes and mouse brain cells, we uncovered a vast diversity of heritable gene expression patterns among different clones of cells of the same type in vivo. We combined chromatin accessibility and RNA profiling on different lymphocyte clones to reveal thousands of regulatory regions exhibiting interclonal variation, which could be directly linked to interclonal variation in gene expression. Our findings identify a source of cellular diversity, which may have important implications for how cellular populations are shaped by selective processes in development, aging, and disease. A record of this paper's transparent peer review process is included in the supplemental information.

Place, publisher, year, edition, pages
Elsevier BV, 2024
Keywords
clonality, epigenetics, gene expression regulation, heritability, immunology, lineage tracing, memory, neuroscience, RNA-seq, single cell
National Category
Medical Genetics and Genomics
Identifiers
urn:nbn:se:kth:diva-344172 (URN); 10.1016/j.cels.2024.01.004 (DOI); 001197740700001 (); 38340731 (PubMedID); 2-s2.0-85185847086 (Scopus ID)
Note

QC 20240308

Available from: 2024-03-06. Created: 2024-03-06. Last updated: 2025-12-05. Bibliographically approved.
Safinianaini, N., De Souza, C. P. .., Roth, A., Koptagel, H., Toosi, H. & Lagergren, J. (2024). CopyMix: Mixture model based single-cell clustering and copy number profiling using variational inference. Computational biology and chemistry (Print), 113, Article ID 108257.
2024 (English). In: Computational biology and chemistry (Print), ISSN 1476-9271, E-ISSN 1476-928X, Vol. 113, article id 108257. Article in journal (Refereed), Published.
Abstract [en]

Investigating tumor heterogeneity using single-cell sequencing technologies is imperative to understand how tumors evolve, since each cell subpopulation harbors a unique set of genomic features that yields a unique phenotype, which is bound to have clinical relevance. Clustering of cells based on copy number data obtained from single-cell DNA sequencing provides an opportunity to identify different tumor cell subpopulations. Accordingly, computational methods have emerged for single-cell copy number profiling and clustering; however, these two tasks have been handled sequentially by applying various ad-hoc pre- and post-processing steps, a procedure that is vulnerable to introducing clustering artifacts. We avoid these clustering artifacts in our method, CopyMix, a variational inference approach for a novel mixture model, by jointly inferring cell clusters and their underlying copy number profiles. Our probabilistic graphical model is an improved version of the mixture of hidden Markov models, designed specifically for single-cell copy number profiling and clustering. For the evaluation, we used the likelihood-ratio test, CH index, Silhouette, V-measure, and total variation scores. CopyMix performs well on both biological and simulated data. Our favorable results indicate considerable potential for clinical impact from using CopyMix in studies of tumor heterogeneity.

Place, publisher, year, edition, pages
Elsevier Ltd, 2024
Keywords
Cancer, Copy number profiling, Mixture models, Single-cell, Tumor clonal decomposition, Variational inference
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:kth:diva-356294 (URN); 10.1016/j.compbiolchem.2024.108257 (DOI); 001350559200001 (); 39500117 (PubMedID); 2-s2.0-85208042394 (Scopus ID)
Note

QC 20241114

Available from: 2024-11-13. Created: 2024-11-13. Last updated: 2025-12-05. Bibliographically approved.
Kurt, S., Chen, M., Toosi, H., Chen, X., Engblom, C., Mold, J., . . . Lagergren, J. (2024). CopyVAE: a variational autoencoder-based approach for copy number variation inference using single-cell transcriptomics. Bioinformatics, 40(5), Article ID btae284.
2024 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 40, no 5, article id btae284. Article in journal (Refereed), Published.
Abstract [en]

Motivation: Copy number variations (CNVs) are common genetic alterations in tumour cells. The delineation of CNVs holds promise for enhancing our comprehension of cancer progression. Moreover, accurate inference of CNVs from single-cell sequencing data is essential for unravelling intratumoral heterogeneity. However, existing inference methods face limitations in resolution and sensitivity.

Results: To address these challenges, we present CopyVAE, a deep learning framework based on a variational autoencoder architecture. Through experiments, we demonstrated that CopyVAE can accurately and reliably detect CNVs from data obtained using single-cell RNA sequencing. CopyVAE surpasses existing methods in terms of sensitivity and specificity. We also discussed CopyVAE’s potential to advance our understanding of genetic alterations and their impact on disease advancement.

Availability and implementation: CopyVAE is implemented and freely available under MIT license at https://github.com/kurtsemih/copyVAE.

Place, publisher, year, edition, pages
Oxford University Press, 2024
National Category
Biological Sciences
Identifiers
urn:nbn:se:kth:diva-346814 (URN); 10.1093/bioinformatics/btae284 (DOI); 001217927500002 (); 38676578 (PubMedID); 2-s2.0-85192946770 (Scopus ID)
Note

QC 20240524

Available from: 2024-05-24. Created: 2024-05-24. Last updated: 2024-11-27. Bibliographically approved.
Hotti, A., Kviman, O., Molén, R., Elvira, V. & Lagergren, J. (2024). Efficient mixture learning in black-box variational inference. In: International Conference on Machine Learning, ICML 2024. Paper presented at 41st International Conference on Machine Learning, ICML 2024, July 21-27, 2024, Vienna, Austria (pp. 18972-18991). ML Research Press
2024 (English). In: International Conference on Machine Learning, ICML 2024, ML Research Press, 2024, p. 18972-18991. Conference paper, Published paper (Refereed).
Abstract [en]

Mixture variational distributions in black box variational inference (BBVI) have demonstrated impressive results in challenging density estimation tasks. However, scaling the number of mixture components can currently lead to a linear increase in the number of learnable parameters and a quadratic increase in inference time due to the evaluation of the evidence lower bound (ELBO). Our two key contributions address these limitations. First, we introduce the novel Multiple Importance Sampling Variational Autoencoder (MISVAE), which amortizes the mapping from input to mixture-parameter space using one-hot encodings. With MISVAE, each additional mixture component incurs a negligible increase in network parameters. Second, we construct two new estimators of the ELBO for mixtures in BBVI, enabling a tremendous reduction in inference time with marginal or even improved impact on performance. Collectively, our contributions enable scalability to hundreds of mixture components and provide superior estimation performance in shorter time, with fewer network parameters compared to previous Mixture VAEs. Experimenting with MISVAE, we achieve astonishing, SOTA results on MNIST. Furthermore, we empirically validate our estimators in other BBVI settings, including Bayesian phylogenetic inference, where we improve inference times for the SOTA mixture model on eight data sets.

Place, publisher, year, edition, pages
ML Research Press, 2024
National Category
Computer and Information Sciences Mathematics
Identifiers
urn:nbn:se:kth:diva-353950 (URN); 2-s2.0-85203836475 (Scopus ID)
Conference
41st International Conference on Machine Learning, ICML 2024, July 21-27, 2024, Vienna, Austria
Note

QC 20240926

Available from: 2024-09-25. Created: 2024-09-25. Last updated: 2025-05-20. Bibliographically approved.
Nilsson, A., Wijk, K., Gutha, S. b., Englesson, E., Hotti, A., Saccardi, C., . . . Azizpour, H. (2024). Indirectly Parameterized Concrete Autoencoders. In: International Conference on Machine Learning, ICML 2024. Paper presented at 41st International Conference on Machine Learning, ICML 2024, Vienna, Austria, Jul 21 2024 - Jul 27 2024 (pp. 38237-38252). ML Research Press
2024 (English). In: International Conference on Machine Learning, ICML 2024, ML Research Press, 2024, p. 38237-38252. Conference paper, Published paper (Refereed).
Abstract [en]

Feature selection is a crucial task in settings where data is high-dimensional or acquiring the full set of features is costly. Recent developments in neural network-based embedded feature selection show promising results across a wide range of applications. Concrete Autoencoders (CAEs), considered state-of-the-art in embedded feature selection, may struggle to achieve stable joint optimization, hurting their training time and generalization. In this work, we identify that this instability is correlated with the CAE learning duplicate selections. To remedy this, we propose a simple and effective improvement: Indirectly Parameterized CAEs (IP-CAEs). IP-CAEs learn an embedding and a mapping from it to the Gumbel-Softmax distributions' parameters. Despite being simple to implement, IP-CAE exhibits significant and consistent improvements over CAE in both generalization and training time across several datasets for reconstruction and classification. Unlike CAE, IP-CAE effectively leverages non-linear relationships and does not require retraining the jointly optimized decoder. Furthermore, our approach is, in principle, generalizable to Gumbel-Softmax distributions beyond feature selection.

Place, publisher, year, edition, pages
ML Research Press, 2024
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-353956 (URN); 2-s2.0-85203808876 (Scopus ID)
Conference
41st International Conference on Machine Learning, ICML 2024, Vienna, Austria, Jul 21 2024 - Jul 27 2024
Note

QC 20240926

Available from: 2024-09-25. Created: 2024-09-25. Last updated: 2024-09-26. Bibliographically approved.
Shafighi, S., Geras, A., Jurzysta, B., Sahaf Naeini, A., Filipiuk, I., Ra̧czkowska, A., . . . Szczurek, E. (2024). Integrative spatial and genomic analysis of tumor heterogeneity with Tumoroscope. Nature Communications, 15(1), Article ID 9343.
2024 (English). In: Nature Communications, E-ISSN 2041-1723, Vol. 15, no 1, article id 9343. Article in journal (Refereed), Published.
Abstract [en]

Spatial and genomic heterogeneity of tumors are crucial factors influencing cancer progression, treatment, and survival. However, a technology for directly mapping the clones in the tumor tissue based on somatic point mutations is lacking. Here, we propose Tumoroscope, the first probabilistic model that accurately infers cancer clones and their localization at close to single-cell resolution by integrating pathological images, whole exome sequencing, and spatial transcriptomics data. In contrast to previous methods, Tumoroscope explicitly addresses the problem of deconvoluting the proportions of clones in spatial transcriptomics spots. Applied to a reference prostate cancer dataset and a newly generated breast cancer dataset, Tumoroscope reveals spatial patterns of clone colocalization and mutual exclusion in sub-areas of the tumor tissue. We further infer clone-specific gene expression levels and the most highly expressed genes for each clone. In summary, Tumoroscope enables an integrated study of the spatial, genomic, and phenotypic organization of tumors.

Place, publisher, year, edition, pages
Springer Nature, 2024
National Category
Cancer and Oncology Cell and Molecular Biology
Identifiers
urn:nbn:se:kth:diva-356317 (URN); 10.1038/s41467-024-53374-3 (DOI); 001367220500035 (); 39472583 (PubMedID); 2-s2.0-85208162192 (Scopus ID)
Note

Correction in DOI 10.1038/s41467-025-58177-8

QC 20250217

Available from: 2024-11-13. Created: 2024-11-13. Last updated: 2025-04-03. Bibliographically approved.
Koptagel, H., Jun, S. H., Hård, J. & Lagergren, J. (2024). Scuphr: A probabilistic framework for cell lineage tree reconstruction. PLoS Computational Biology, 20(5 May), Article ID e1012094.
2024 (English). In: PLoS Computational Biology, ISSN 1553-734X, E-ISSN 1553-7358, Vol. 20, no 5 May, article id e1012094. Article in journal (Refereed), Published.
Abstract [en]

Cell lineage tree reconstruction methods are developed for various tasks, such as investigating development, differentiation, and cancer progression. Single-cell sequencing technologies enable more thorough analysis with higher resolution. We present Scuphr, a distance-based cell lineage tree reconstruction method using bulk and single-cell DNA sequencing data from healthy tissues. Scuphr accounts for common challenges of single-cell DNA sequencing, such as allelic dropouts and amplification errors. Scuphr computes the distance between cell pairs and reconstructs the lineage tree using the neighbor-joining algorithm. With its embarrassingly parallel design, Scuphr can perform faster analysis than state-of-the-art methods while obtaining better accuracy. The method’s robustness is investigated using various synthetic datasets and a biological dataset of 18 cells.

Place, publisher, year, edition, pages
Public Library of Science (PLoS), 2024
National Category
Bioinformatics and Computational Biology
Identifiers
urn:nbn:se:kth:diva-346807 (URN); 10.1371/journal.pcbi.1012094 (DOI); 001219374600002 (); 38723024 (PubMedID); 2-s2.0-85193031437 (Scopus ID)
Note

QC 20240524

Available from: 2024-05-24. Created: 2024-05-24. Last updated: 2025-02-07. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-4552-0240