Rapid and enhanced remote homology detection by cascading hidden Markov model searches in sequence space
2016 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 32, no 3, 338-344 p. Article in journal (Refereed). Published. Text.
Motivation: In the post-genomic era, automatic annotation of protein sequences using computational homology-based methods is highly desirable. However, protein sequences often diverge to an extent where detection of homology and automatic annotation transfer are not straightforward. Sophisticated approaches to detect such distant relationships are needed. We propose a new approach to identify deep evolutionary relationships of proteins to overcome shortcomings of the available methods.
Results: We have developed a method to identify remote homologues more effectively from any protein sequence database by using several cascading events with Hidden Markov Models (C-HMM). We have implemented clustering of hits and profile generation of hit clusters to effectively reduce the computational time of the cascaded sequence searches. Our C-HMM approach achieved 94, 83 and 40% coverage at family, superfamily and fold levels, respectively, when applied to diverse protein folds. We have compared C-HMM with various remote homology detection methods and discuss the trade-offs between coverage and false positives.
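The cascading idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `cascade_search`, its parameters, and the toy data are all hypothetical, and the `search` and `cluster` callables are stand-ins for a real profile HMM search and a real clustering step. In particular, a cluster's member list stands in here for the profile that would be built from that cluster.

```python
from typing import Callable, Dict, List, Set


def cascade_search(
    seed: str,
    search: Callable[[List[str]], Set[str]],
    cluster: Callable[[Set[str]], List[List[str]]],
    max_rounds: int = 3,
) -> Set[str]:
    """Widen a homology search round by round (illustrative only).

    Each round runs one search per cluster of the previous round's new hits,
    rather than one search per hit, which is what keeps the cascade cheap.
    """
    found: Set[str] = {seed}
    queries: List[List[str]] = [[seed]]  # round 1 queries with the seed alone
    for _ in range(max_rounds):
        new_hits: Set[str] = set()
        for members in queries:
            # Keep only sequences not seen in any earlier round.
            new_hits |= search(members) - found
        if not new_hits:
            break  # the cascade has converged
        found |= new_hits
        queries = cluster(new_hits)  # one query per hit cluster next round
    return found


# Toy "database" (hypothetical): which sequences each sequence can detect.
NEIGHBOURS: Dict[str, Set[str]] = {
    "a1": {"a2"},
    "a2": {"b1", "b2"},
    "b1": {"c1"},
    "b2": set(),
    "c1": set(),
}


def toy_search(members: List[str]) -> Set[str]:
    """Stand-in for a profile search: union of what each member detects."""
    hits: Set[str] = set()
    for m in members:
        hits |= NEIGHBOURS.get(m, set())
    return hits


def toy_cluster(hits: Set[str]) -> List[List[str]]:
    """Stand-in for sequence clustering: group ids by their first letter."""
    groups: Dict[str, List[str]] = {}
    for h in sorted(hits):
        groups.setdefault(h[0], []).append(h)
    return list(groups.values())
```

In the toy data, `c1` is not detectable from the seed `a1` directly and is only reached in the third round, via `a2` and then `b1`. That mirrors how intermediate sequences can bridge distant relationships. Capping `max_rounds` is one simple way to trade coverage against the risk of drifting into false positives.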
Place, publisher, year, edition, pages
Oxford University Press, 2016. Vol. 32, no 3, 338-344 p.
Computer Science; Biochemistry and Molecular Biology
Identifiers
URN: urn:nbn:se:kth:diva-183321
DOI: 10.1093/bioinformatics/btv538
ISI: 000370203000004
PubMedID: 26454276
ScopusID: 2-s2.0-84962263993
OAI: oai:DiVA.org:kth-183321
DiVA: diva2:910533
QC 20160309; 2016-03-09; 2016-03-07; 2016-03-19; Bibliographically approved