A simple null model for inferences from network enrichment analysis
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Gene Technology. KTH, Centres, Science for Life Laboratory, SciLifeLab.
KTH, Centres, Science for Life Laboratory, SciLifeLab. KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Gene Technology. ORCID iD: 0000-0001-5689-9797
2018 (English). In: PLOS ONE, E-ISSN 1932-6203, Vol. 13, no 11, article id e0206864. Article in journal (Refereed). Published
Abstract [en]

A prevailing technique for inferring function from lists of identifications produced by high-throughput molecular biology experiments is over-representation analysis, in which the identifications are compared to predefined sets of related genes, often referred to as pathways. As at least some pathways are known to be incompletely annotated, algorithmic efforts have been made to complement them with information from functional association networks. While the terminology varies in the literature, we here refer to such methods as Network Enrichment Analysis (NEA). Traditionally, the significance of inferences from NEA has been assigned using a null model constructed from randomizations of the network. Here we instead argue for a null model that relates more directly to the set of genes being studied, and have designed a dynamic programming algorithm that calculates the distribution of NEA scores, making it possible to assign unbiased mid p values to inferences. We also implemented a random sampling method that carries out the same task. We demonstrate that our method obtains a superior statistical calibration compared to the popular NEA inference engine BinoX, while also providing statistics that are easier to interpret.
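
The idea of an exact score distribution with mid p values can be illustrated with a minimal sketch. It is not the authors' implementation, and this record does not specify their scoring. The sketch assumes, purely for illustration, that the NEA score of a query set is the sum of each query gene's number of network links to the pathway, and that under the null the query genes are drawn uniformly without replacement. The names `score_distribution`, `mid_p_value`, `sampled_mid_p_value` and `link_counts` are hypothetical.

```python
import random
from fractions import Fraction
from math import comb


def score_distribution(link_counts, k):
    """Exact null distribution of the summed link count of k genes.

    link_counts[g] is the (assumed) number of network links from gene g
    to the pathway; k genes are drawn uniformly without replacement.
    Returns a list where entry s is P(score == s).
    """
    max_score = sum(sorted(link_counts, reverse=True)[:k])
    # ways[j][s] = number of j-gene subsets seen so far whose links sum to s
    ways = [[0] * (max_score + 1) for _ in range(k + 1)]
    ways[0][0] = 1
    for d in link_counts:
        # Walk j downwards so that each gene enters a subset at most once.
        for j in range(k - 1, -1, -1):
            row, nxt = ways[j], ways[j + 1]
            for s in range(max_score - d + 1):
                if row[s]:
                    nxt[s + d] += row[s]
    total = comb(len(link_counts), k)
    return [Fraction(w, total) for w in ways[k]]


def mid_p_value(dist, observed):
    """Mid p value: P(score > observed) + 0.5 * P(score == observed)."""
    upper = sum(dist[observed + 1:], Fraction(0))
    at = dist[observed] if observed < len(dist) else Fraction(0)
    return upper + at / 2


def sampled_mid_p_value(link_counts, k, observed, n_draws=10000, seed=0):
    """Monte Carlo estimate of the same mid p value by random sampling."""
    rng = random.Random(seed)
    greater = equal = 0
    for _ in range(n_draws):
        s = sum(rng.sample(link_counts, k))
        greater += s > observed
        equal += s == observed
    return (greater + equal / 2) / n_draws


# Toy example: four genes with 0, 1, 1 and 2 links to the pathway,
# and a query set of two genes with an observed score of 3.
dist = score_distribution([0, 1, 1, 2], k=2)
print(mid_p_value(dist, observed=3))
print(sampled_mid_p_value([0, 1, 1, 2], k=2, observed=3))
```

Exact fractions are used so that the dynamic programming result is free of rounding error; the sampling function is included only as a cross-check, and its estimate should approach the exact value as `n_draws` grows.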

Place, publisher, year, edition, pages
Public Library of Science, 2018. Vol. 13, no 11, article id e0206864
National Category
Genetics
Identifiers
URN: urn:nbn:se:kth:diva-239780
DOI: 10.1371/journal.pone.0206864
ISI: 000449772600027
PubMedID: 30412619
Scopus ID: 2-s2.0-85056317407
OAI: oai:DiVA.org:kth-239780
DiVA, id: diva2:1276678
Note

QC 20190108

Available from: 2019-01-08 Created: 2019-01-08 Last updated: 2024-03-18. Bibliographically approved
In thesis
1. Pathway analysis: methods and perspectives
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The amount of data being generated by high throughput molecular biology experiments grows every day, both in quantity and quality. With this comes the desire to have more powerful and comprehensive methods for statistical analysis that have been developed with the nature of this data in mind.

One of the lines of research that has been developed with this specific goal in mind is pathway analysis. Here, pathways are units of information that have been curated in a way that makes biological knowledge of cellular processes available in a programmatic way, and pathway analysis methods make use of this information to help understand the results of high throughput experiments.

This is an exploratory thesis on the field of pathway analysis. I give a brief introduction to the field, what motivated its development, the problems it tries to solve, and some of the proposed statistical methods, together with some discussion on the implications of this type of analysis.

I then present three original works on pathway analysis, each with a different perspective on the task. First, we present a more reliable null model for pathway analysis methods that use functional association networks, which results in better-calibrated statistics. Second, we show how we can combine pathway analysis methods with other statistical methods, such as survival analysis. We applied this method to a large breast cancer cohort and show that in this case pathways provide better prognostic power than individual genes. Third, we leverage concepts from information theory to design an original pathway analysis method that is very sensitive and flexible, while being practically without parameters. Together, all three papers contribute to furthering the field's usefulness and to the understanding of this type of analysis. 

Abstract [sv, translated to English]

The amount of data generated in large-scale molecular biology experiments is increasing steadily, in both quantity and quality. As a consequence, the need for more powerful and more comprehensive methods for the interpretation and statistical analysis of such data is growing.

One research methodology that attempts to solve problems associated with the statistical analysis of large, mixed biological datasets is pathway analysis (a pathway being a path or sequence of steps). A biological or biomedical pathway is a unit of annotated information that has been curated in such a way that it represents prior biological knowledge. The programmatically available information about plausible connections in the large dataset can include metabolic processes, cellular localization, or biochemical function. The large number of pathways then enables systematic data integration and an improved understanding of large datasets from high-throughput experiments.

In this thesis we describe pathway analysis by first giving a brief introduction to the techniques, what motivated their development, the problems pathway analysis tries to solve, and some of the proposed statistical methods, together with some discussion of the implications of this type of analysis.

I then present three publications on pathway analysis, each with a different perspective on the task. First, we present a more reliable, graph-based statistical null model for pathway analysis methods that build on functional association networks, which results in better-calibrated statistics. In the second article, we show how pathway analysis methodology can be combined with other statistical methods, such as survival analysis. We applied this method to a large breast cancer cohort and show that in this case pathways give better prognostic power than individual genes. In the third article, we use concepts from information theory to design an improved pathway analysis methodology that is very sensitive and flexible while being practically parameter-free. Together, all three articles contribute to increasing the field's usefulness and the understanding of this type of analysis.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2022. p. 53
Series
TRITA-CBH-FOU ; 2022:41
Keywords
pathway analysis, mutual information, survival analysis, enrichment analysis, transcriptomics.
National Category
Bioinformatics (Computational Biology)
Research subject
Biotechnology
Identifiers
URN: urn:nbn:se:kth:diva-316604
ISBN: 978-91-8040-322-1
Public defence
2022-09-28, Air & Fire, Tomtebodavägen 23A, via Zoom: https://kth-se.zoom.us/j/61760412942, Solna, 14:00 (English)
Note

QC 2022-08-25

Available from: 2022-08-25 Created: 2022-08-24 Last updated: 2022-09-16. Bibliographically approved


Authority records

Jeuken, Gustavo S.; Käll, Lukas
