Toward an Integrated Machine Learning Model of a Proteomics Experiment
VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium.
VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium; Department of Biomolecular Medicine, Faculty of Health Sciences and Medicine, Ghent University, 9000 Ghent, Belgium.
Institute for Systems Biology, Seattle, Washington 98109, United States.
MSAID GmbH, 10559 Berlin, Germany.
Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark.
Department of Biology, Brigham Young University, Provo, Utah 84602, United States.
Institute for Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark.
MSAID GmbH, 85748 Garching, Germany.
Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense, Denmark.
Medical Proteome Analysis, Center for Protein Diagnostics (ProDi), Ruhr University Bochum, 44801 Bochum, Germany; Medizinisches Proteom-Center, Medical Faculty, Ruhr University Bochum, 44801 Bochum, Germany.
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.
Computational Mass Spectrometry, Technical University of Munich (TUM), 85354 Freising, Germany.
Leiden University Medical Center, Postbus 9600, 2300 RC Leiden, The Netherlands.
2023 (English). In: Journal of Proteome Research, ISSN 1535-3893, E-ISSN 1535-3907, Vol. 22, no 3, p. 681-696. Article, review/survey (Refereed). Published
Abstract [en]
In recent years, machine learning has made extensive progress in modeling many aspects of mass spectrometry data. We brought together proteomics data generators, repository managers, and machine learning experts in a workshop with the goal of evaluating and exploring machine learning applications for realistic modeling of data from multidimensional mass spectrometry-based proteomics analysis of any sample or organism. Following this sample-to-data roadmap helped identify knowledge gaps and define needs. Being able to generate bespoke and realistic synthetic data has legitimate and important uses in system suitability, method development, and algorithm benchmarking, while also posing critical ethical questions. The interdisciplinary nature of the workshop informed discussions of what is currently possible and of future opportunities and challenges. In the following perspective, we summarize these discussions in the hope of conveying our excitement about the potential of machine learning in proteomics and inspiring future research.
Place, publisher, year, edition, pages
American Chemical Society (ACS), 2023. Vol. 22, no 3, p. 681-696
Keywords [en]
artificial intelligence, deep learning, enzymatic digestion, ion mobility, liquid chromatography, machine learning, research integrity, synthetic data, tandem mass spectrometry
National Category
Bioinformatics (Computational Biology); Bioinformatics and Computational Biology
Identifiers
URN: urn:nbn:se:kth:diva-338423
DOI: 10.1021/acs.jproteome.2c00711
ISI: 000934905300001
PubMedID: 36744821
Scopus ID: 2-s2.0-85147873620
OAI: oai:DiVA.org:kth-338423
DiVA, id: diva2:1806615
Note
QC 20231023
2023-10-23; 2023-10-23; 2025-02-05. Bibliographically approved