Probabilistic Orthology Analysis
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB.
2009 (English). In: Systematic Biology, ISSN 1063-5157, E-ISSN 1076-836X, Vol. 58, no. 4, p. 411-424. Article in journal (Refereed), Published
Abstract [en]

Orthology analysis aims at identifying orthologous genes and gene products from different organisms and, therefore, is a powerful tool in modern computational and experimental biology. Although reconciliation-based orthology methods are generally considered more accurate than distance-based ones, the traditional parsimony-based implementation of reconciliation-based orthology analysis (most parsimonious reconciliation [MPR]) suffers from a number of shortcomings. For example, 1) it is limited to orthology predictions from the reconciliation that minimizes the number of gene duplication and loss events, 2) it cannot evaluate the support of this reconciliation in relation to the other reconciliations, and 3) it cannot make use of prior knowledge (e.g., about species divergence times) that provides auxiliary information for orthology predictions. We present a probabilistic approach to reconciliation-based orthology analysis that addresses all these issues by estimating orthology probabilities. The method is based on the gene evolution model, an explicit evolutionary model for gene duplication and gene loss inside a species tree that generalizes the standard birth-death process. We describe the probabilistic approach to orthology analysis using two experimental data sets and show that the use of orthology probabilities allows a more informative analysis than MPR and, in particular, that it is less sensitive to taxon sampling problems. We generalize these anecdotal observations and show, using data generated under biologically realistic conditions, that MPR gives false orthology predictions at a substantial frequency. Finally, we provide a new orthology prediction method that allows an orthology and paralogy classification with any chosen sensitivity/specificity combination from the spectrum of achievable combinations.
We conclude that probabilistic orthology analysis is a strong and more advanced alternative to traditional orthology analysis and that it provides a framework for sophisticated comparative studies of processes in genome evolution.
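
The sensitivity/specificity trade-off mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each gene pair already has an estimated orthology probability and that a pair is called orthologous when that probability meets a chosen threshold. This thresholding rule, the function names, the gene labels, and all numbers below are hypothetical illustrations, not the method or data of the article.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]


def classify(probs: Dict[Pair, float], threshold: float) -> Dict[Pair, bool]:
    """Call a gene pair orthologous when its probability meets the threshold."""
    return {pair: p >= threshold for pair, p in probs.items()}


def sensitivity_specificity(
    calls: Dict[Pair, bool], truth: Dict[Pair, bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the calls against known labels.

    Assumes truth contains at least one ortholog pair and one paralog pair.
    """
    true_pos = sum(1 for p in truth if calls[p] and truth[p])
    true_neg = sum(1 for p in truth if not calls[p] and not truth[p])
    positives = sum(1 for p in truth if truth[p])
    negatives = len(truth) - positives
    return true_pos / positives, true_neg / negatives


# Hypothetical orthology probabilities and hypothetical true relationships
# (True = orthologs, False = paralogs); invented purely for illustration.
probs = {("a1", "b1"): 0.95, ("a1", "b2"): 0.40,
         ("a2", "b1"): 0.65, ("a2", "b2"): 0.10}
truth = {("a1", "b1"): True, ("a1", "b2"): False,
         ("a2", "b1"): True, ("a2", "b2"): False}

for threshold in (0.3, 0.5, 0.9):
    sens, spec = sensitivity_specificity(classify(probs, threshold), truth)
    print(f"threshold={threshold:.1f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Lowering the threshold in this toy example favors sensitivity, and raising it favors specificity. A single parsimony-based reconciliation yields one fixed set of calls and offers no such adjustable parameter.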

Keyword [en]
Comparative genomics, gene duplication, gene loss, orthology, paralogy, probabilistic modeling, phylogenetics, tree reconciliation, multiple gene loci, molecular phylogenies, multigene families, reconciled trees, death evolution, purifying selection, divergence times, cog database, genome, inference
URN: urn:nbn:se:kth:diva-18774
DOI: 10.1093/sysbio/syp046
ISI: 000270005000003
Scopus ID: 2-s2.0-70349413154
OAI: diva2:336821
QC 20100525
Available from: 2010-08-05. Created: 2010-08-05. Last updated: 2011-01-14. Bibliographically approved

By author/editor
Lagergren, Jens