Metrics for the Human Proteome Project 2013-2014 and Strategies for Finding Missing Proteins
2014 (English). In: Journal of Proteome Research, ISSN 1535-3893, E-ISSN 1535-3907, Vol. 13, no 1, p. 15-20. Article in journal (Refereed), Published
One year ago the Human Proteome Project (HPP) leadership designated the baseline metrics for the Human Proteome Project to be based on neXtProt with a total of 13 664 proteins validated at protein evidence level 1 (PE1) by mass spectrometry, antibody-capture, Edman sequencing, or 3D structures. Corresponding chromosome-specific data were provided from PeptideAtlas, GPMdb, and Human Protein Atlas. This year, the neXtProt total is 15 646 and the other resources, which are inputs to neXtProt, have high-quality identifications and additional annotations for 14 012 in PeptideAtlas, 14 869 in GPMdb, and 10 976 in HPA. We propose to remove 638 genes from the denominator that are "uncertain" or "dubious" in Ensembl, UniProt/SwissProt, and neXtProt. That leaves 3844 "missing proteins", currently having no or inadequate documentation, to be found from a new denominator of 19 490 protein-coding genes. We present those tabulations and web links and discuss current strategies to find the missing proteins.
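The counts in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the "missing proteins" figure is simply the new denominator minus the neXtProt PE1 total, a reading that matches the quoted numbers but is not spelled out in the abstract. The constant and function names are hypothetical.

```python
# Figures quoted in the abstract (HPP metrics, 2013-2014).
PREVIOUS_PE1 = 13_664      # neXtProt PE1 proteins one year earlier
NEXTPROT_PE1 = 15_646      # neXtProt PE1 proteins this year
NEW_DENOMINATOR = 19_490   # protein-coding genes after removing 638 uncertain/dubious entries


def missing_proteins(denominator: int, validated: int) -> int:
    """Assumed definition: genes in the denominator without PE1-level evidence."""
    return denominator - validated


missing = missing_proteins(NEW_DENOMINATOR, NEXTPROT_PE1)
gain = NEXTPROT_PE1 - PREVIOUS_PE1          # PE1 proteins added over the year
coverage = NEXTPROT_PE1 / NEW_DENOMINATOR   # fraction of the denominator at PE1

print(missing)            # 3844
print(gain)               # 1982
print(f"{coverage:.1%}")  # 80.3%
```

Under this reading, roughly four fifths of the revised gene list already has PE1-level evidence, and the 3844 remaining entries are the targets of the strategies the article discusses.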
Keywords: human proteome project, neXtProt, PeptideAtlas, GPMdb, human protein atlas, metrics, missing proteins
Biochemistry and Molecular Biology
Identifiers: URN: urn:nbn:se:kth:diva-141313; DOI: 10.1021/pr401144x; ISI: 000329472700003; ScopusID: 2-s2.0-84891818310; OAI: oai:DiVA.org:kth-141313; DiVA: diva2:696643
Funder: EU, FP7, Seventh Framework Programme; Knut and Alice Wallenberg Foundation
QC 20140214; 2014-02-14; 2014-02-13; 2014-02-14; Bibliographically approved