A combined transmembrane topology and signal peptide prediction method
2004 (English) In: Journal of Molecular Biology, ISSN 0022-2836, E-ISSN 1089-8638, Vol. 338, no 5, 1027-1036 p. Article in journal (Refereed) Published
An inherent problem in transmembrane protein topology prediction and signal peptide prediction is the high similarity between the hydrophobic regions of a transmembrane helix and those of a signal peptide, leading to cross-reaction between the two types of predictions. To improve predictions further, it is therefore important to make a predictor that aims to discriminate between the two classes. In addition, topology information can be gained when successfully predicting a signal peptide leading a transmembrane protein, since it dictates that the N terminus of the mature protein must be on the non-cytoplasmic side of the membrane. Here, we present Phobius, a combined transmembrane protein topology and signal peptide predictor. The predictor is based on a hidden Markov model (HMM) that models the different sequence regions of a signal peptide and the different regions of a transmembrane protein in a series of interconnected states. Training was done on a newly assembled and curated dataset. Compared to TMHMM and SignalP, errors coming from cross-prediction between transmembrane segments and signal peptides were reduced substantially by Phobius. False classifications of signal peptides were reduced from 26.1% to 3.9% and false classifications of transmembrane helices were reduced from 19.0% to 7.7%. Phobius was applied to the proteomes of Homo sapiens and Escherichia coli. Here we also noted a drastic reduction of false classifications compared to TMHMM/SignalP, suggesting that Phobius is well suited for whole-genome annotation of signal peptides and transmembrane regions. The method is available at http://phobius.cgb.ki.se/ as well as at http://phobius.binf.ku.dk/.
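The idea of decoding signal peptide and transmembrane states within one interconnected HMM can be illustrated with a deliberately small sketch. This is not the Phobius model: the five states, the two-class residue alphabet (hydrophobic vs. other), the `HYDROPHOBIC` residue set, and every probability below are assumptions chosen only for illustration. The transition table encodes the constraint mentioned in the abstract, namely that a signal peptide state (`S`) can only be followed by a non-cytoplasmic state (`o`).

```python
import math

# Toy state set (an assumption, far simpler than the published model):
#   S    signal peptide
#   o    non-cytoplasmic loop
#   i    cytoplasmic loop
#   M_oi helix crossing from the non-cytoplasmic to the cytoplasmic side
#   M_io helix crossing from the cytoplasmic to the non-cytoplasmic side
STATES = ["S", "o", "M_oi", "i", "M_io"]

# Hypothetical two-class alphabet: residues counted as hydrophobic.
HYDROPHOBIC = set("AVLIMFWC")

# Illustrative probabilities only; none of these values come from the paper.
START = {"S": 0.3, "i": 0.4, "o": 0.3}
TRANS = {
    "S": {"S": 0.9, "o": 0.1},       # after a signal peptide: non-cytoplasmic side
    "o": {"o": 0.9, "M_oi": 0.1},
    "M_oi": {"M_oi": 0.9, "i": 0.1},
    "i": {"i": 0.9, "M_io": 0.1},
    "M_io": {"M_io": 0.9, "o": 0.1},
}
P_HYDROPHOBIC = {"S": 0.8, "M_oi": 0.8, "M_io": 0.8, "i": 0.2, "o": 0.2}


def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def emission(state, residue):
    """Probability that `state` emits `residue` under the two-class alphabet."""
    p = P_HYDROPHOBIC[state]
    return p if residue in HYDROPHOBIC else 1.0 - p


def viterbi(sequence):
    """Return the most probable state path for `sequence` (Viterbi decoding)."""
    if not sequence:
        return []
    # Initialisation: start probability times emission of the first residue.
    score = {s: _log(START.get(s, 0.0)) + _log(emission(s, sequence[0]))
             for s in STATES}
    back = []  # back[t-1][s] = best predecessor of state s at position t
    for residue in sequence[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(
                STATES, key=lambda p: score[p] + _log(TRANS[p].get(s, 0.0)))
            new_score[s] = (score[best_prev]
                            + _log(TRANS[best_prev].get(s, 0.0))
                            + _log(emission(s, residue)))
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)
    # Traceback from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    print(viterbi("LLLLLLLLDDDDDDDD"))  # hydrophobic stretch at the N terminus
    print(viterbi("DDDDLLLLLLLLDDDD"))  # hydrophobic stretch inside the sequence
```

In this toy, an N-terminal hydrophobic stretch and an internal one emit identically, so only the state topology separates a signal peptide from a helix. The published model describes distinct signal peptide regions and is trained on curated data, which this sketch does not attempt to reproduce.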
transmembrane protein, signal peptide, topology prediction, hidden Markov model, machine learning
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:kth:diva-48866
DOI: 10.1016/j.jmb.2004.03.016
ISI: 000221305200014
PubMedID: 15111065
OAI: oai:DiVA.org:kth-48866
DiVA: diva2:458724
QC 20111124
2011-11-23 2011-11-23 2011-11-24 Bibliographically approved