Assessment of protein distance measures and tree-building methods for phylogenetic tree reconstruction
KTH, School of Computer Science and Communication (CSC), Numerical Analysis and Computer Science, NADA.
2005 (English). In: Molecular biology and evolution, ISSN 0737-4038, E-ISSN 1537-1719, Vol. 22, no. 11, pp. 2257-2264. Article in journal (Refereed), Published
Abstract [en]

Distance-based methods are popular for reconstructing evolutionary trees of protein sequences, mainly because of their speed and generality. A number of variants of the classical neighbor-joining (NJ) algorithm have been proposed, as well as a number of methods to estimate protein distances. Here we present a large-scale assessment of how well the most popular algorithms reconstruct the correct tree topology. The programs BIONJ, FastME, Weighbor, and standard NJ were run using 12 distance estimators, producing 48 combinations of tree-building and distance estimation methods. These were evaluated on a test set based on real trees taken from 100 Pfam families. Each tree was used to generate multiple sequence alignments with the ROSE program using three evolutionary models. The accuracy of each method was analyzed as a function of both sequence divergence and location in the tree. We found that BIONJ produced the best results overall, although the average accuracy differed little between the tree-building methods (normally less than 1%). A noticeable trend was that FastME performed worse than the rest on long branches. Weighbor was several orders of magnitude slower than the other programs. Larger differences were observed when using different distance estimators. Protein-adapted Jukes-Cantor and Kimura distance corrections produced clearly poorer results than the other methods, even worse than uncorrected distances. We also assessed the recently developed Scoredist measure, which performed as well as more complex methods.

Place, publisher, year, edition, pages
2005. Vol. 22, no 11, 2257-2264 p.
Keyword [en]
protein distance estimation, phylogenetic tree reconstruction, neighbor-joining, maximum-likelihood approach, minimum-evolution, sequences, inference, families, efficiencies, matrices, model
URN: urn:nbn:se:kth:diva-15092
DOI: 10.1093/molbev/msi224
ISI: 000232426500015
ScopusID: 2-s2.0-26444452202
OAI: diva2:333133
QC 20100525
Available from: 2010-08-05. Created: 2010-08-05. Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full textScopus

Search in DiVA

By author/editor
Arvestad, Lars