Pathway Analysis Through Mutual Information
KTH, Centres, Science for Life Laboratory, SciLifeLab. KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Gene Technology. ORCID iD: 0000-0002-4438-2325
KTH, Centres, Science for Life Laboratory, SciLifeLab. KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Gene Technology. ORCID iD: 0000-0001-5689-9797
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Pathway analysis comes in many forms. Most methods seek to connect the activity of a biological pathway to a difference in phenotype, often relying on an upstream differential expression analysis to establish the difference between case and control. Such analyses usually model this relationship under many assumptions, often of a linear nature, and may involve statistical tests for which calculating false discovery rates is not trivial.

Here, we propose a new method for pathway analysis, MIPath, that relies on information-theoretic principles and is therefore free of any model for the nature of the association between pathway activity and phenotype, resulting in a minimal set of assumptions. To this end, we construct a separate graph of samples for each pathway and score the association between the structure of this graph and any phenotype variable using mutual information, while adjusting each score for the effects of random chance.

Our experiments show that this method produces robust and reproducible scores that rank target pathways highly on single-cell datasets, outperforming established pathway analysis methods under the same conditions.
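
The idea in the abstract can be illustrated with a minimal sketch. This is not the MIPath implementation: the function name `pathway_ami_score`, the choice of a k-nearest-neighbour graph, the use of connected components as a crude stand-in for the graph's structure, and the synthetic data are all assumptions made for illustration. The chance adjustment is represented here by scikit-learn's adjusted mutual information, which may differ from the adjustment used in the manuscript.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.metrics import adjusted_mutual_info_score
from sklearn.neighbors import kneighbors_graph


def pathway_ami_score(expr, gene_idx, phenotype, k=8):
    """Score one pathway against a phenotype (hypothetical helper).

    expr      : (samples x genes) expression matrix
    gene_idx  : column indices of the genes in the pathway
    phenotype : one discrete label per sample
    """
    # Restrict the data to the pathway's genes.
    sub = expr[:, gene_idx]
    # Build a sample graph for this pathway (assumption: kNN graph).
    graph = kneighbors_graph(sub, n_neighbors=k, mode="connectivity",
                             include_self=False)
    # Summarise graph structure as a partition of the samples
    # (assumption: connected components, a deliberately simple choice).
    _, modules = connected_components(graph, directed=False)
    # Mutual information between partition and phenotype,
    # adjusted for the agreement expected by chance.
    return adjusted_mutual_info_score(phenotype, modules)


# Synthetic example: 20 controls and 20 cases, 30 genes.
rng = np.random.default_rng(0)
n_per_group, n_genes = 20, 30
phenotype = np.repeat([0, 1], n_per_group)
expr = rng.normal(size=(2 * n_per_group, n_genes))

signal_genes = np.arange(0, 10)   # pathway shifted in cases
null_genes = np.arange(10, 20)    # pathway with no case/control difference
expr[np.ix_(phenotype == 1, signal_genes)] += 8.0

score_signal = pathway_ami_score(expr, signal_genes, phenotype)
score_null = pathway_ami_score(expr, null_genes, phenotype)
print("signal pathway:", score_signal)
print("null pathway:", score_null)
```

In this toy setting the pathway whose genes differ between cases and controls should receive a clearly higher score than the unrelated one. Because the score only compares a partition with a label vector, no linear model of the association is required.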

National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:kth:diva-316544
OAI: oai:DiVA.org:kth-316544
DiVA, id: diva2:1689313
Note

QC 20220831

Available from: 2022-08-22 Created: 2022-08-22 Last updated: 2022-08-31 Bibliographically approved
In thesis
1. Pathway analysis: methods and perspectives
2022 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The amount of data being generated by high throughput molecular biology experiments grows every day, both in quantity and quality. With this comes the desire to have more powerful and comprehensive methods for statistical analysis that have been developed with the nature of this data in mind.

One line of research developed with this specific goal is pathway analysis. Here, pathways are units of information that have been curated so that biological knowledge of cellular processes is available programmatically, and pathway analysis methods use this information to help interpret the results of high-throughput experiments.

This is an exploratory thesis on the field of pathway analysis. I give a brief introduction to the field, what motivated its development, the problems it tries to solve, and some of the proposed statistical methods, together with some discussion of the implications of this type of analysis.

I then present three original works on pathway analysis, each with a different perspective on the task. First, we present a more reliable null model for pathway analysis methods that use functional association networks, which results in better-calibrated statistics. Second, we show how pathway analysis methods can be combined with other statistical methods, such as survival analysis. We applied this method to a large breast cancer cohort and show that in this case pathways provide better prognostic power than individual genes. Third, we leverage concepts from information theory to design an original pathway analysis method that is very sensitive and flexible, while being practically parameter-free. Together, all three papers contribute to furthering the field's usefulness and to the understanding of this type of analysis.

Abstract [sv] (translated to English)

The amount of data generated in large-scale molecular biology experiments is increasing steadily, in both quantity and quality. As a consequence, there is a growing need for more powerful and more comprehensive methods for the interpretation and statistical analysis of such data.

One research methodology that tries to solve problems associated with the statistical analysis of large, heterogeneous biological datasets is pathway analysis (a pathway being a path or a sequence of steps). A biological or biomedical pathway is a unit of annotated information that has been curated so that it represents prior biological knowledge. This programmatically available information about plausible connections in the large dataset can include metabolic processes, cellular localization, or biochemical function. The large number of pathways then enables systematic data integration and an improved understanding of large datasets from high-throughput experiments.

In this thesis we describe pathway analysis by first giving a brief introduction to the techniques, what motivated their development, the problems pathway analysis tries to solve, and some of the proposed statistical methods, together with some discussion of the implications of this type of analysis.

I then present three publications on pathway analysis, each with a different perspective on the task. First, we present a more reliable, graph-based statistical null model for pathway analysis methods that build on functional association networks, which results in better-calibrated statistics. In the second article we show how pathway analysis methodology can be combined with other statistical methods, such as survival analysis. We applied this method to a large breast cancer cohort and show that in this case pathways give better prognostic power than individual genes. In the third article we use concepts from information theory to design an improved pathway analysis methodology that is very sensitive and flexible, while being practically parameter-free. Together, all three articles contribute to increasing the field's usefulness and the understanding of this type of analysis.

Place, publisher, year, edition, pages
Stockholm, Sweden: KTH Royal Institute of Technology, 2022. p. 53
Series
TRITA-CBH-FOU ; 2022:41
Keywords
pathway analysis, mutual information, survival analysis, enrichment analysis, transcriptomics.
National Category
Bioinformatics (Computational Biology)
Research subject
Biotechnology
Identifiers
urn:nbn:se:kth:diva-316604 (URN)
978-91-8040-322-1 (ISBN)
Public defence
2022-09-28, Air & Fire, Tomtebodavägen 23A, via Zoom: https://kth-se.zoom.us/j/61760412942, Solna, 14:00 (English)
Opponent
Supervisors
Note

QC 2022-08-25

Available from: 2022-08-25 Created: 2022-08-24 Last updated: 2022-09-16 Bibliographically approved

Open Access in DiVA

fulltext (769 kB), 110 downloads
File information
File name: FULLTEXT01.pdf
File size: 769 kB
Checksum (SHA-512): 5f4e000f27fcabc754f871bbf29e1f8e1948a9fdddf1b0df4eb9d36fddc80be201c49ba5dc4b62319a69baddea9bd17648874776f708b9d3a8508e13100f9529
Type: fulltext
Mimetype: application/pdf

Other links

Preprint on bioRxiv

Authority records

Jeuken, Gustavo S.; Käll, Lukas
