Taking advantage of phylogenetic trees in comparative genomics
2008 (English) Doctoral thesis, comprehensive summary (Other scientific)
Phylogenomics can be regarded as evolution and genomics in co-operation. Various kinds of evolutionary studies, gene family analysis among them, demand access to genome-scale datasets. It is equally clear that many genomics studies, such as assignment of gene function, are much improved by evolutionary analysis. The work leading to this thesis is a contribution to the phylogenomics field. We have used phylogenetic relationships between species in genome-scale searches for two intriguing genomic features, namely pseudogenes and A-to-I RNA editing. In the first case we used pairwise species comparisons, specifically human-mouse and human-chimpanzee, to infer the existence of functional mammalian pseudogenes. In the second case we took advantage of the rapid growth in the number of sequenced genomes in recent years, and used 17-species multiple sequence alignments. In both studies we used non-genomic data, among them gene expression data and synteny relations, to verify predictions. In the A-to-I editing project we used 454 sequencing for experimental verification.
We have further contributed a maximum a posteriori (MAP) method for fast and accurate dating analysis of speciations and other evolutionary events. This work follows the trend in recent years of abandoning the strict molecular clock when performing phylogenetic inference. We discretised the time interval from the leaves to the root of the tree, and used a dynamic programming (DP) algorithm to optimally factorise branch lengths into substitution rates and divergence times. We analysed two biological datasets and compared our results with recent MCMC-based methodologies. The dating point estimates delivered by our method were found to be of high quality, while the gain in speed was dramatic.
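The idea of factorising branch lengths over a discretised time interval can be illustrated with a minimal Python sketch. It is not the method of the thesis: the function name `date_tree`, its parameters, and the scoring function are hypothetical, and a simple squared log-deviation from an assumed reference rate stands in for the MAP rate and time priors. Leaves are fixed at time 0 and the root at `root_time`, and each branch length is split as rate times elapsed time.

```python
import math

def date_tree(children, branch_length, root, n_steps=20, root_time=1.0, mean_rate=1.0):
    """Toy DP dating on a rooted tree (illustrative assumptions throughout).

    children: dict mapping each internal node to a list of its children.
    branch_length: dict mapping each non-root node to the (positive)
        length of the branch leading to it.
    Returns (times, rates): node -> divergence time, node -> branch rate.
    """
    neg_inf = float("-inf")
    dt = root_time / n_steps  # spacing of the time grid

    def log_score(rate):
        # Assumed stand-in for a rate prior: penalise deviation from mean_rate.
        return -(math.log(rate) - math.log(mean_rate)) ** 2

    best = {}    # best[v][i]: best score of the subtree at v if v is at grid index i
    choice = {}  # choice[v][i]: grid indices chosen for v's children in that case

    def solve(v):
        kids = children.get(v, [])
        best[v] = [neg_inf] * (n_steps + 1)
        choice[v] = [()] * (n_steps + 1)
        if not kids:
            best[v][0] = 0.0  # leaves are pinned to time 0
            return
        for c in kids:
            solve(c)
        for i in range(1, n_steps + 1):
            total, picks = 0.0, []
            for c in kids:
                # A child must sit strictly below its parent on the grid.
                cands = [
                    (best[c][j] + log_score(branch_length[c] / ((i - j) * dt)), j)
                    for j in range(i)
                    if best[c][j] > neg_inf
                ]
                if not cands:
                    total = neg_inf
                    break
                score, j = max(cands)
                total += score
                picks.append(j)
            if total > neg_inf:
                best[v][i] = total
                choice[v][i] = tuple(picks)

    solve(root)

    times = {}

    def assign(v, i):
        # Trace back the optimal grid positions from the root downwards.
        times[v] = i * dt
        for c, j in zip(children.get(v, []), choice[v][i]):
            assign(c, j)

    assign(root, n_steps)
    rates = {
        c: branch_length[c] / (times[p] - times[c])
        for p in children
        for c in children[p]
    }
    return times, rates

# Usage on a hypothetical three-leaf tree ((A,B)X,C)R.
tree = {"R": ["X", "C"], "X": ["A", "B"]}
lengths = {"A": 0.5, "B": 0.5, "X": 0.5, "C": 1.0}
times, rates = date_tree(tree, lengths, "R")
```

Because each child's best placement depends only on its parent's grid position, the work grows with the number of nodes times the square of the number of grid points, which suggests why a DP approach of this kind can be much faster than sampling.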
Finally, we applied the DP strategy in a new setting. This time we used a grid laid out on a species tree instead of on an interval. Together with the speciation times, the discretisation gives a common timeframe for a gene tree and the corresponding species tree. This is the key to integrating the sequence evolution process and the gene evolution process. Out of several potential application areas we chose gene tree reconstruction. We performed a genome-wide analysis of yeast gene families and found that our methodology performs very well.
Place, publisher, year, edition, pages
Stockholm: KTH, 2008. 53 p.
Trita-CSC-A, ISSN 1653-5723 ; 2008:09
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:kth:diva-4757
ISBN: 978-91-7178-987-7
OAI: oai:DiVA.org:kth-4757
DiVA: diva2:13796
2008-06-04, FD05, Albanova, Roslagstullsbacken 21, Stockholm, 09:30
Blanchette, Mathieu, Assistant professor
QC 20100923
2008-05-16
2008-05-16
2010-09-23
Bibliographically approved
List of papers