Multiset correlation and factor analysis enables exploration of multi-omics data
New York Genome Center, New York, NY, USA; Data Science Institute, Columbia University, New York, NY, USA.
New York Genome Center, New York, NY, USA; Department of Computer Science, Columbia University, New York, NY, USA.
New York Genome Center, New York, NY, USA; Department of Systems Biology, Columbia University, New York, NY, USA.
Illumina Incorporated, San Francisco, CA, USA; The Broad Institute of MIT and Harvard, Boston, MA, USA.
2023 (English). In: Cell Genomics, E-ISSN 2666-979X, Vol. 3, no 8, p. 100359-, article id 100359. Article in journal (Refereed), Published.
Abstract [en]

Multi-omics datasets are becoming more common, necessitating better integration methods to realize their revolutionary potential. Here, we introduce multi-set correlation and factor analysis (MCFA), an unsupervised integration method tailored to the unique challenges of high-dimensional genomics data that enables fast inference of shared and private factors. We used MCFA to integrate methylation markers, protein expression, RNA expression, and metabolite levels in 614 diverse samples from the Trans-Omics for Precision Medicine/Multi-Ethnic Study of Atherosclerosis multi-omics pilot. Samples cluster strongly by ancestry in the shared space, even in the absence of genetic information, while private spaces frequently capture dataset-specific technical variation. Finally, we integrated genetic data by conducting a genome-wide association study (GWAS) of our inferred factors, observing that several factors are enriched for GWAS hits and trans-expression quantitative trait loci. Two of these factors appear to be related to metabolic disease. Our study provides a foundation and framework for further integrative analysis of ever larger multi-modal genomic datasets.

Place, publisher, year, edition, pages
Elsevier BV, 2023. Vol. 3, no 8, p. 100359-, article id 100359
National Category
Bioinformatics and Computational Biology; Genetics and Genomics; Medical Genetics and Genomics
Identifiers
URN: urn:nbn:se:kth:diva-337432
DOI: 10.1016/j.xgen.2023.100359
ISI: 001103153500001
Scopus ID: 2-s2.0-85171660962
OAI: oai:DiVA.org:kth-337432
DiVA, id: diva2:1801924
Note

QC 20231003

Available from: 2023-10-03. Created: 2023-10-03. Last updated: 2025-02-10. Bibliographically approved.

Open Access in DiVA

No full text in DiVA


Authority records

Lappalainen, Tuuli
