Identification of related proteins on family, superfamily and fold level.
KTH, Superseded Departments, Physics. ORCID iD: 0000-0002-2734-2794
Stockholm University.
2000 (English). In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 295, no 3, 613-25 p. Article in journal (Refereed). Published
Abstract [en]

Proteins might have considerable structural similarities even when no evolutionary relationship of their sequences can be detected. This property is often referred to as the proteins sharing only a "fold". Of course, there are also sequences of common origin in each fold, called a "superfamily", and in them groups of sequences with clear similarities, designated "family". Developing algorithms to reliably identify proteins related at any level is one of the most important challenges in the fast growing field of bioinformatics today. However, it is not at all certain that a method proficient at finding sequence similarities performs well at the other levels, or vice versa.

Here, we have compared the performance of various search methods on these different levels of similarity. As expected, we show that it becomes much harder to detect proteins as their sequences diverge. For family related sequences the best method gets 75% of the top hits correct. When the sequences differ but the proteins belong to the same superfamily this drops to 29%, and in the case of proteins with only fold similarity it is as low as 15%. We have made a more complete analysis of the performance of different algorithms than earlier studies, also including threading methods in the comparison. Using this method a more detailed picture emerges, showing multiple sequence information to improve detection on the two closer levels of relationship. We have also compared the different methods of including this information in prediction algorithms. For lower specificities, the best scheme to use is a linking method connecting proteins through an intermediate hit. For higher specificities, better performance is obtained by PSI-BLAST and some procedures using hidden Markov models. We also show that a threading method, THREADER, performs significantly better than any other method at fold recognition.

Place, publisher, year, edition, pages
2000. Vol. 295, no 3, 613-25 p.
National Category
Bioinformatics and Systems Biology
URN: urn:nbn:se:kth:diva-82637
DOI: 10.1006/jmbi.1999.3377
PubMedID: 10623551
OAI: diva2:498403
NR 20140805
Available from: 2012-02-12 Created: 2012-02-12 Last updated: 2012-02-12 Bibliographically approved

By author/editor
Lindahl, E