Detecting LGTs using a novel probabilistic model integrating duplications, LGTs, losses, rate variation, and sequence evolution
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
2009 (English) Manuscript (preprint) (Other academic)
Abstract [en]

The debate over the prevalence of lateral gene transfers (LGTs) has been intense. There is now broad consensus that LGT is an important evolutionary force, and also regarding its relative importance across species. This consensus relies, however, mainly on studies of individual gene families.

Until now, the gold standard for identifying LGTs has been phylogenetic methods in which LGTs are inferred from incongruities between a species tree and an associated gene tree. Even in cases where there is evidence of LGT, concerns have often been raised regarding the significance of the evidence. One common concern has been the possibility that other evolutionary events have caused the incongruities. Another has been the significance of the gene trees involved in the inference; there may, for instance, be alternative, almost equally likely gene trees that do not provide evidence for LGT. Independently of these concerns, there has been a need for methods that can quantitatively characterize the level of LGT among sets of species, and for methods able to pinpoint where in the species tree LGTs have occurred.

Here, we provide the first probabilistic model capturing gene duplication, LGT, gene loss, and point mutations with a relaxed molecular clock. We also provide all fundamental algorithms required to analyze a gene family relative to a given species tree under this model. Our algorithms are based on Markov chain Monte Carlo (MCMC) methodology but also build on techniques from numerical analysis and involve dynamic programming (DP).

National Category
Industrial Biotechnology; Computer Science
URN: urn:nbn:se:kth:diva-10971
OAI: diva2:233574

QC 20100812

Available from: 2009-09-01 Created: 2009-09-01 Last updated: 2016-02-02 Bibliographically approved
In thesis
1. Using Trees to Capture Reticulate Evolution: Lateral Gene Transfers and Cancer Progression
2009 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The historical relationships of species and genes are traditionally depicted using trees. However, not all evolutionary histories are adequately captured by bifurcating processes, and an increasing amount of research is devoted to using networks or network-like structures to capture evolutionary history. Lateral gene transfer (LGT) is a previously controversial mechanism responsible for non-tree-like evolutionary histories, and is today accepted as a major force of evolution, particularly in the prokaryotic domain.

In this thesis, we present models of gene evolution incorporating both LGTs and duplications, together with efficient computational methods for various inference problems. Specifically, we define a biologically sound combinatorial model for reconciliation of species and gene trees that facilitates simultaneous consideration of duplications and LGTs. We prove that finding most parsimonious reconciliations is NP-hard, but that the problem can be solved efficiently if reconciliations are not required to be acyclic—a condition that is satisfied when analyzing most real-world datasets. We also provide a polynomial-time algorithm for parametric tree reconciliation, a problem analogous to parametric sequence alignment, that enables us to study the entire space of optimal reconciliations under all possible cost schemes.

Going beyond combinatorial models, we define the first probabilistic model of gene evolution incorporating a birth-death process generating duplications, LGTs, and losses, together with a relaxed molecular clock model of sequence evolution. Algorithms based on Markov chain Monte Carlo (MCMC) techniques, methods from numerical analysis, and dynamic programming are presented for various probability and parameter inference problems.

Finally, we develop methods for analysis of cancer progression, a biological process with many similarities to the process of evolution. Cancer progresses by accumulation of harmful genetic aberrations whose patterns of emergence are graph-like. We develop a model of cancer progression based on trees, and mixtures thereof, that admits an efficient structural EM algorithm for finding Maximum Likelihood (ML) solutions from available cross-sectional data.

Place, publisher, year, edition, pages
Stockholm: KTH, 2009. viii, 68 p.
Trita-CSC-A, ISSN 1653-5723 ; 2009:10
Lateral Gene Transfer, Horizontal Gene Transfer, Cancer Progression
National Category
Bioinformatics and Systems Biology; Computer Science
urn:nbn:se:kth:diva-10608 (URN)
978-91-7415-349-1 (ISBN)
Public defence
2009-06-12, Svedbergssalen, Albanova, Roslagstullsbacken 21, Stockholm, 10:00 (English)
QC 20100812
Available from: 2009-06-04 Created: 2009-06-02 Last updated: 2010-08-12 Bibliographically approved

By author/editor
Tofigh, Ali; Lagergren, Jens