A cross-validation scheme for machine learning algorithms in shotgun proteomics
KTH, School of Biotechnology (BIO), Gene Technology. KTH, Centres, Science for Life Laboratory, SciLifeLab. ORCID iD: 0000-0001-5689-9797
2012 (English). In: BMC Bioinformatics, ISSN 1471-2105, Vol. 13, S3- p. Article in journal (Refereed) Published
Abstract [en]

Peptides are routinely identified from mass spectrometry-based proteomics experiments by matching observed spectra to peptides derived from protein databases. The error rates of these identifications can be estimated by target-decoy analysis, which involves matching spectra to shuffled or reversed peptides. Besides estimating error rates, decoy searches can be used by semi-supervised machine learning algorithms to increase the number of confidently identified peptides. As for all machine learning algorithms, however, the results must be validated to avoid issues such as overfitting or biased learning, which would produce unreliable peptide identifications. Here, we discuss how the target-decoy method is employed in machine learning for shotgun proteomics, focusing on how the results can be validated by cross-validation, a frequently used validation scheme in machine learning. We also use simulated data to demonstrate the proposed cross-validation scheme's ability to detect overfitting.
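
The two ideas the abstract combines can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the scheme proposed in the paper: the function names (`estimate_fdr`, `split_folds`, `cross_validated_scores`) and the `train` callable are hypothetical, and the false discovery rate is estimated with the simple ratio of decoy matches to target matches above a score threshold. The cross-validation part only enforces the basic rule that each item is scored by a model that never saw it during training.

```python
import random


def estimate_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR among target matches scoring >= threshold.

    Assumes the simple target-decoy estimate: decoys above the threshold
    stand in for incorrect target matches above the same threshold.
    """
    n_targets = sum(s >= threshold for s in target_scores)
    n_decoys = sum(s >= threshold for s in decoy_scores)
    if n_targets == 0:
        return 0.0
    return min(1.0, n_decoys / n_targets)


def split_folds(n_items, k, seed=0):
    """Randomly assign the indices 0..n_items-1 to k disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_scores(items, k, train, seed=0):
    """Score every item with a model trained only on the other folds.

    `train` is a hypothetical callable: it takes a list of training items
    and returns a function that maps one item to a score.
    """
    folds = split_folds(len(items), k, seed)
    scores = [None] * len(items)
    for held_out in folds:
        held = set(held_out)
        training = [items[i] for i in range(len(items)) if i not in held]
        score = train(training)
        for i in held_out:
            scores[i] = score(items[i])
    return scores


if __name__ == "__main__":
    targets = [10, 9, 8, 7, 3]
    decoys = [6, 2, 1]
    # Four targets and one decoy score at least 5.
    print(estimate_fdr(targets, decoys, threshold=5))
```

Because the held-out items never influence the model that scores them, an overfitted model gains no advantage on them. Scores produced by models from different folds may need further calibration before they are pooled, which this sketch omits.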

Place, publisher, year, edition, pages
2012. Vol. 13, S3- p.
Keyword [en]
Tandem Mass-Spectrometry, False Discovery Rate, Peptide Identification, Statistical Significance, Protein Identifications, Database Search, Spectra, Model, Probabilities, Networks
National Category
Biological Sciences
URN: urn:nbn:se:kth:diva-116737
DOI: 10.1186/1471-2105-13-S16-S3
ISI: 000312714500003
OAI: diva2:600697
Swedish Research Council; Swedish Foundation for Strategic Research; Science for Life Laboratory - a national resource center for high-throughput molecular bioscience

QC 20130125

Available from: 2013-01-25. Created: 2013-01-25. Last updated: 2013-01-25. Bibliographically approved.

By author/editor
Käll, Lukas