Reconstruction of Ancestral Genomic Sequences Using Likelihood
2007 (English). In: Journal of Computational Biology, ISSN 1066-5277, E-ISSN 1557-8666, Vol. 14, no. 2, pp. 216-237. Article in journal (Refereed). Published.
A challenging task in computational biology is the reconstruction of genomic sequences of extinct ancestors, given the phylogenetic tree and the sequences at the leaves. This task is best solved by calculating the most likely estimate of the ancestral sequences, along with the most likely edge lengths. We address this problem and also the variant in which the phylogenetic tree, in addition to the ancestral sequences, needs to be estimated. The latter problem is known to be NP-hard, while the computational complexity of the former is unknown. Currently, all algorithms for solving these problems are heuristics without performance guarantees. The biological importance of these problems calls for developing better algorithms with guarantees of finding either optimal or approximate solutions. We develop approximation, fixed-parameter tractable (FPT), and fast heuristic algorithms for two variants of the problem: when the phylogenetic tree is known and when it is unknown. The approximation algorithm guarantees a solution with a log-likelihood ratio of 2 relative to the optimal solution. The FPT algorithm has a running time that is polynomial in the length of the sequences and exponential in the number of taxa. This makes it useful for calculating the optimal solution for small trees. Moreover, we combine the approximation algorithm and the FPT algorithm into an algorithm with an arbitrarily good approximation guarantee (PTAS). We tested our algorithms on both synthetic and biological data. In particular, we used the FPT algorithm to compute the most likely ancestral mitochondrial genomes of Hominidae (the great apes), thereby answering an interesting biological question. Moreover, we show how the approximation algorithms find good solutions for reconstructing the ancestral genomes for a set of lentiviruses (relatives of HIV). Supplementary material for this work is available at www.nada.kth.se/~isaac/publications/aml/aml.html.
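To make the objective in the abstract concrete, the following is a minimal Python sketch of ancestral likelihood scoring on a fixed tree. It is not the article's FPT or approximation algorithm. It assumes a symmetric two-state substitution model (the abstract does not name a model), binary sequences of equal length, and a uniform root prior, which only adds a constant and is omitted. The function names `edge_log_likelihood` and `brute_force_aml` are hypothetical. The search is a naive enumeration of all internal labelings, exponential in both the sequence length and the number of internal nodes, so it is practical only for toy inputs.

```python
import itertools
import math


def edge_log_likelihood(a, b):
    """Log-likelihood of one edge with its change probability set to the
    maximum-likelihood value (assumed symmetric two-state model)."""
    k = len(a)
    d = sum(x != y for x, y in zip(a, b))  # Hamming distance along the edge
    p = min(d / k, 0.5)  # the assumed model restricts p to [0, 1/2]
    ll = 0.0
    if d > 0:
        ll += d * math.log(p)
    if d < k:
        ll += (k - d) * math.log(1.0 - p)
    return ll


def brute_force_aml(edges, leaves, internal, k):
    """Try every binary labeling of the internal nodes and return
    (best log-likelihood, best labeling). Naive exhaustive search."""
    all_seqs = ["".join(bits) for bits in itertools.product("01", repeat=k)]
    best_ll, best_labels = -math.inf, None
    for combo in itertools.product(all_seqs, repeat=len(internal)):
        seqs = dict(leaves)
        seqs.update(zip(internal, combo))
        ll = sum(edge_log_likelihood(seqs[u], seqs[v]) for u, v in edges)
        if ll > best_ll:
            best_ll, best_labels = ll, dict(zip(internal, combo))
    return best_ll, best_labels


# Toy quartet tree (illustrative data only): leaves A, B hang off u; C, D off v.
edges = [("A", "u"), ("B", "u"), ("u", "v"), ("C", "v"), ("D", "v")]
leaves = {"A": "000", "B": "000", "C": "111", "D": "111"}
print(brute_force_aml(edges, leaves, ["u", "v"], k=3))
```

Because each edge length is optimized jointly over all sites, the sites cannot be scored independently, which is one reason the joint problem is harder than per-site reconstruction.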
Keywords: ancestral maximum likelihood, most parsimonious likelihood, PTAS, FPT, 2-approximation
Identifiers: URN: urn:nbn:se:kth:diva-6356; DOI: 10.1089/cmb.2006.0101; ISI: 000245943700006; Scopus ID: 2-s2.0-34248184703; OAI: oai:DiVA.org:kth-6356; DiVA: diva2:11045
QC 20110121. Updated from manuscript to article. 2006-11-15; 2006-11-15; 2012-09-26. Bibliographically approved.