Publications (10 of 17)
Le, T., Winsnes, C. F., Axelsson, U., Xu, H., Mohanakrishnan Kaimal, J., Mahdessian, D., . . . Lundberg, E. (2022). Analysis of the Human Protein Atlas Weakly Supervised Single-Cell Classification competition. Nature Methods, 19(10), 1221-1229
2022 (English). In: Nature Methods, ISSN 1548-7091, E-ISSN 1548-7105, Vol. 19, no 10, p. 1221-1229. Article in journal (Refereed). Published
Abstract [en]

While spatial proteomics by fluorescence imaging has quickly become an essential discovery tool for researchers, fast and scalable methods to classify and embed single-cell protein distributions in such images are lacking. Here, we present the design and analysis of the results from the competition Human Protein Atlas – Single-Cell Classification hosted on the Kaggle platform. This represents a crowd-sourced competition to develop machine learning models trained on limited annotations to label single-cell protein patterns in fluorescent images. The particular challenges of this competition include class imbalance, weak labels and multi-label classification, prompting competitors to apply a wide range of approaches in their solutions. The winning models serve as the first subcellular omics tools that can annotate single-cell locations, extract single-cell features and capture cellular dynamics. 

Place, publisher, year, edition, pages
Springer Nature, 2022
Keywords
cell protein, protein, Article, cell nucleus inclusion body, classification, competition, computer model, fluorescence imaging, machine learning, multilabel classification, protein function, protein localization, proteomics, single cell analysis, human, Humans, Proteins
National Category
Cell Biology
Identifiers
urn:nbn:se:kth:diva-328119 (URN); 10.1038/s41592-022-01606-z (DOI); 000863153600001 (ISI); 36175767 (PubMedID); 2-s2.0-85139247548 (Scopus ID)
Note

QC 20230602

Available from: 2023-06-02. Created: 2023-06-02. Last updated: 2023-06-02. Bibliographically approved
Winsnes, C. F. (2022). On computational methods for spatial mapping of the human proteome. (Doctoral dissertation). Stockholm: KTH Royal Institute of Technology
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Proteins are complex molecules that are involved in almost every task in the body. In general, the role a protein fulfills is highly dependent on where in the cell it is located, its subcellular localization. In order to understand human biology, it is therefore imperative to gain insight into the world of proteins by examining their subcellular distribution and interaction with each other. This thesis focuses on the development of computational models capable of performing large scale spatial protein analysis on a subcellular level. Within that scope, we were able to develop models that classify the localization of proteins in immunofluorescence microscopy images as well as show how such models can integrate with other methods to gain novel insights and understanding into the roles and spatially dependent functions of proteins. 

In Paper I, we present and combine two separate methods for large scale protein localization. The first method is an integration of a protein localization task as a mini-game within an established massively multiplayer online video game. The second method consists of the first image-based deep neural network learning model capable of multi-label subcellular localization classification. We show that both these methods enable accurate and scalable high-throughput analysis of subcellular protein localization that overcomes many of the challenges associated with such a dataset. We also show that combining the two methods yields better results than either of them does on its own, resulting in a model that is nearing human performance.

In Paper II, based on the success of the neural network model from Paper I, we continue the investigation into usage of deep neural networks for the purpose of subcellular protein localization. In an effort to find the best possible model for such tasks, a machine learning image competition was developed. Over 2,000 teams participated with various kinds of architectures, resulting in a predictor that far outperforms the one presented in Paper I. The winning model is analyzed thoroughly, and we show that its internal feature representation contains biologically relevant information and that it can be used for quantitative analysis of protein patterns.

 Paper III takes the feature representation of immunofluorescence images from the model developed in Paper II and integrates it with features extracted from affinity purification experiments to create a hierarchical map of the human cell’s architecture. This method creates a map of protein communities grouped by subcellular structures, of which approximately 54% are putatively novel. We show that the map is biologically significant by validating several of the novel findings using affinity purification experiments and in-situ fractionation. 

In Paper IV, we apply what was learned in Papers I and II in order to create a model that identifies proteins residing within micronuclei. We apply the model to the image data from the Human Protein Atlas to create the first extensive mapping of the micronuclear proteome. Through enrichment analysis of the identified proteins, we propose that micronuclei harbor a more diverse set of functions than previously thought. We find that the micronuclear proteome is highly interconnected and contains many proteins that show visible variations across different micronuclei, and theorize on what this means for their role in the cell.

In conclusion, Papers I and II examine and establish the possibilities of using deep neural networks for systematic subcellular protein localization analysis. Papers III and IV build upon what was learned in Papers I and II and use their models to examine protein distribution patterns and provide novel biological insights.

Abstract [sv]

Proteins are complex molecules that are involved in nearly every bodily function. In general, a protein's role depends heavily on where in the cell it resides, its subcellular localization. To understand human biology, it is therefore necessary to gain insight into the world of proteins by examining their subcellular distribution and how they interact with one another. This thesis focuses on the development of computational models capable of carrying out large-scale spatial protein analysis at a subcellular level. Within this area of application, we were able to develop models that classify the localization of proteins in immunofluorescence microscopy images and to show how such models can interact with other methods to yield new insights into the roles of proteins and their spatially dependent functions.

In Paper I, we present and combine two separate methods for large-scale protein localization. The first method is an integration of a protein localization task as a mini-game in an established massive online game. The second method consists of the first image-based deep neural network model capable of multi-label classification of subcellular protein localization. We show that both methods make it possible to carry out accurate, scalable, high-throughput analyses of subcellular protein localization that overcome many of the difficulties associated with such datasets. We also show that a combination of the two methods produces better results than either method does alone, resulting in a model that approaches human performance.

In Paper II, building on the success of the neural network model from Paper I, we continue to investigate the use of deep neural networks for subcellular protein localization. In an attempt to find the best possible model for such tasks, we developed an image-based machine learning competition. Over 2,000 teams participated with various types of architectures, resulting in a predictor that far outperforms the one presented in Paper I. The winning model is analyzed thoroughly, and we show that its internal numerical representation contains biologically relevant information and that it can be used for quantitative analysis of protein patterns.

Paper III takes the numerical representation of immunofluorescence images from the model developed in Paper II and integrates it with a numerical representation extracted from affinity purification experiments to create a hierarchical map of the architecture of the human cell. This method produces a map of groups of proteins, of which approximately 54% are putatively novel. We show that the map is biologically significant by validating several of the new findings with affinity purification experiments and in situ fractionation.

In Paper IV, we apply what we learned from Papers I and II to create a model that identifies proteins located in micronuclei. We apply the model to image data from the Human Protein Atlas to create the first extensive mapping of the micronuclear proteome. Using enrichment analysis, we propose that micronuclei have more diverse functionality than previously assumed. We find that the micronuclear proteome is highly interconnected and contains many proteins that show variation between different micronuclei, and we discuss what this means for their role in the cell.

In summary, Papers I and II examine and establish the possibilities of using deep neural networks for systematic analysis of subcellular protein localization. Papers III and IV build on what we learned in Papers I and II and use their models to examine protein distribution patterns, providing new biological insights.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2022. p. 59
Series
TRITA-CBH-FOU ; 2022:65
Keywords
Spatial proteomics, Human Protein Atlas, Micronuclei, Citizen science
National Category
Cell Biology
Research subject
Biotechnology
Identifiers
urn:nbn:se:kth:diva-321477 (URN)978-91-8040-437-2 (ISBN)
Public defence
2022-12-16, Samuelssonsalen, Tomtebodavägen 6, Solna, 10:09 (English)
Funder
Swedish Research Council; Knut and Alice Wallenberg Foundation
Note

QC 2022-11-17

Available from: 2022-11-17. Created: 2022-11-16. Last updated: 2022-12-08. Bibliographically approved
Qin, Y., Huttlin, E. L., Winsnes, C. F., Gosztyla, M. L., Wacheul, L., Kelly, M. R., . . . Ideker, T. (2021). A multi-scale map of cell structure fusing protein images and interactions. Nature, 600(7889), 536-+
2021 (English). In: Nature, ISSN 0028-0836, E-ISSN 1476-4687, Vol. 600, no 7889, p. 536-+. Article in journal (Refereed). Published
Abstract [en]

The cell is a multi-scale structure with modular organization across at least four orders of magnitude(1). Two central approaches for mapping this structure, protein fluorescent imaging and protein biophysical association, each generate extensive datasets, but of distinct qualities and resolutions that are typically treated separately(2,3). Here we integrate immunofluorescence images in the Human Protein Atlas(4) with affinity purifications in BioPlex(5) to create a unified hierarchical map of human cell architecture. Integration is achieved by configuring each approach as a general measure of protein distance, then calibrating the two measures using machine learning. The map, known as the multi-scale integrated cell (MuSIC 1.0), resolves 69 subcellular systems, of which approximately half are to our knowledge undocumented. Accordingly, we perform 134 additional affinity purifications and validate subunit associations for the majority of systems. The map reveals a pre-ribosomal RNA processing assembly and accessory factors, which we show govern rRNA maturation, and functional roles for SRRM1 and FAM120C in chromatin and RPS3A in splicing. By integration across scales, MuSIC increases the resolution of imaging while giving protein interactions a spatial dimension, paving the way to incorporate diverse types of data in proteome-wide cell maps.

Place, publisher, year, edition, pages
Springer Nature, 2021
National Category
Pharmaceutical and Medical Biotechnology
Identifiers
urn:nbn:se:kth:diva-306856 (URN); 10.1038/s41586-021-04115-9 (DOI); 000730754700043 (ISI); 34819669 (PubMedID); 2-s2.0-85120644334 (Scopus ID)
Note

QC 20220110

Available from: 2022-01-10. Created: 2022-01-10. Last updated: 2025-02-17. Bibliographically approved
Ouyang, W., Winsnes, C. F., Hjelmare, M., Åkesson, L., Xu, H., Sullivan, D. P., . . . et al. (2020). Analysis of the Human Protein Atlas Image Classification competition (vol 16, pg 1254, 2019). Nature Methods, 17(1), 115-115
2020 (English). In: Nature Methods, ISSN 1548-7091, E-ISSN 1548-7105, Vol. 17, no 1, p. 115-115. Article in journal (Refereed). Published
Place, publisher, year, edition, pages
Springer Nature, 2020
Identifiers
urn:nbn:se:kth:diva-300743 (URN); 10.1038/s41592-019-0699-x (DOI); 000508582900046 (ISI); 31822866 (PubMedID); 2-s2.0-85076415178 (Scopus ID)
Note

QC 20210902

Available from: 2021-09-02. Created: 2021-09-02. Last updated: 2022-06-25. Bibliographically approved
Ouyang, W., Winsnes, C. F., Hjelmare, M., Åkesson, L., Xu, H., Sullivan, D. P., . . . et al. (2020). Analysis of the Human Protein Atlas Image Classification competition (vol 16, pg 1254, 2019). Nature Methods, 17(2), 241-241
2020 (English). In: Nature Methods, ISSN 1548-7091, E-ISSN 1548-7105, Vol. 17, no 2, p. 241-241. Article in journal (Refereed). Published
Abstract [en]

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

Place, publisher, year, edition, pages
Springer Nature, 2020
National Category
Medical and Health Sciences
Identifiers
urn:nbn:se:kth:diva-300722 (URN); 10.1038/s41592-020-0734-y (DOI); 000508797500002 (ISI); 31969731 (PubMedID); 2-s2.0-85078147986 (Scopus ID)
Note

QC 20210902

Available from: 2021-09-02. Created: 2021-09-02. Last updated: 2022-06-25. Bibliographically approved
Ouyang, W., Winsnes, C. F., Hjelmare, M., Åkesson, L., Xu, H., Sullivan, D. P. & Lundberg, E. (2019). Analysis of the Human Protein Atlas Image Classification competition. Nature Methods, 16(12), 1254-+
2019 (English). In: Nature Methods, ISSN 1548-7091, E-ISSN 1548-7105, Vol. 16, no 12, p. 1254-+. Article in journal (Refereed). Published
Abstract [en]

Pinpointing subcellular protein localizations from microscopy images is easy to the trained eye, but challenging to automate. Based on the Human Protein Atlas image collection, we held a competition to identify deep learning solutions to solve this task. Challenges included training on highly imbalanced classes and predicting multiple labels per image. Over 3 months, 2,172 teams participated. Despite convergence on popular networks and training techniques, there was considerable variety among the solutions. Participants applied strategies for modifying neural networks and loss functions, augmenting data and using pretrained networks. The winning models far outperformed our previous effort at multi-label classification of protein localization patterns by approximately 20%. These models can be used as classifiers to annotate new images, feature extractors to measure pattern similarity or pretrained networks for a wide range of biological applications.

Place, publisher, year, edition, pages
Springer Nature, 2019
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:kth:diva-266288 (URN); 10.1038/s41592-019-0658-6 (DOI); 000499653100025 (ISI); 31780840 (PubMedID); 2-s2.0-85075762199 (Scopus ID)
Note

Correction in DOI: 10.1038/s41592-019-0699-x ISI: 000508582900046

QC 20200329

Available from: 2020-01-07. Created: 2020-01-07. Last updated: 2022-11-16. Bibliographically approved
Sullivan, D. P., Winsnes, C. F., Åkesson, L., Hjelmare, M., Wiking, M., Schutten, R., . . . Lundberg, E. (2018). Deep learning is combined with massive-scale citizen science to improve large-scale image classification. Nature Biotechnology, 36(9), 820-+
2018 (English). In: Nature Biotechnology, ISSN 1087-0156, E-ISSN 1546-1696, Vol. 36, no 9, p. 820-+. Article in journal (Refereed). Published
Abstract [en]

Pattern recognition and classification of images are key challenges throughout the life sciences. We combined two approaches for large-scale classification of fluorescence microscopy images. First, using the publicly available data set from the Cell Atlas of the Human Protein Atlas (HPA), we integrated an image-classification task into a mainstream video game (EVE Online) as a mini-game, named Project Discovery. Participation by 322,006 gamers over 1 year provided nearly 33 million classifications of subcellular localization patterns, including patterns that were not previously annotated by the HPA. Second, we used deep learning to build an automated Localization Cellular Annotation Tool (Loc-CAT). This tool classifies proteins into 29 subcellular localization patterns and can deal efficiently with multi-localization proteins, performing robustly across different cell types. Combining the annotations of gamers and deep learning, we applied transfer learning to create a boosted learner that can characterize subcellular protein distribution with F1 score of 0.72. We found that engaging players of commercial computer games provided data that augmented deep learning and enabled scalable and readily improved image classification.
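
As a rough illustration of how an F1 score like the one reported above can be computed for multi-label predictions, the sketch below averages per-class F1 over all classes (macro averaging). The abstract does not state which averaging scheme was used, so the macro choice, the function name `macro_f1`, and the toy labels are all assumptions made for illustration only.

```python
from typing import List, Set


def macro_f1(true_labels: List[Set[str]], pred_labels: List[Set[str]]) -> float:
    """Macro-averaged F1 over every label seen in the truth or the predictions.

    Each list element is the set of labels for one image, so an image
    may carry several localization labels at once (multi-label).
    """
    classes = sorted(set().union(*true_labels, *pred_labels))
    scores = []
    for c in classes:
        pairs = list(zip(true_labels, pred_labels))
        tp = sum(1 for t, p in pairs if c in t and c in p)      # true positives
        fp = sum(1 for t, p in pairs if c not in t and c in p)  # false positives
        fn = sum(1 for t, p in pairs if c in t and c not in p)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: three images, each with one or two labels.
truth = [{"nucleus"}, {"nucleus", "cytosol"}, {"mitochondria"}]
pred = [{"nucleus"}, {"cytosol"}, {"mitochondria", "cytosol"}]
print(round(macro_f1(truth, pred), 4))  # per-class F1: 2/3, 1, 2/3 -> 0.7778
```

Macro averaging weights rare and common classes equally, which is one common choice when classes are imbalanced; a micro average would instead pool the counts across classes before computing a single F1.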

Place, publisher, year, edition, pages
Nature Publishing Group, 2018
National Category
Biological Sciences
Identifiers
urn:nbn:se:kth:diva-235602 (URN); 10.1038/nbt.4225 (DOI); 000443986000023 (ISI); 30125267 (PubMedID); 2-s2.0-85053076602 (Scopus ID)
Note

QC 20181001

Available from: 2018-10-01. Created: 2018-10-01. Last updated: 2024-03-15. Bibliographically approved
Thul, P., Åkesson, L., Axelsson, U., Bäckström, A., Danielsson, F., Gnann, C., . . . Lundberg, E. (2018). Multilocalizing Human Proteins. Molecular Biology of the Cell, 29(26)
2018 (English). In: Molecular Biology of the Cell, ISSN 1059-1524, E-ISSN 1939-4586, Vol. 29, no 26. Article in journal, Meeting abstract (Other academic). Published
Place, publisher, year, edition, pages
American Society for Cell Biology, 2018
National Category
Biochemistry Molecular Biology
Identifiers
urn:nbn:se:kth:diva-303809 (URN); 000505772701038 (ISI)
Note

QC 20211021

Available from: 2021-10-21. Created: 2021-10-21. Last updated: 2025-02-20. Bibliographically approved
Thul, P., Åkesson, L., Mahdessian, D., Axelsson, U., Bäckström, A., Hjelmare, M., . . . Lundberg, E. (2018). The HPA Cell Atlas: Dissecting the spatiotemporal subcellular distribution of the human proteome. Molecular Biology of the Cell, 29(26)
2018 (English). In: Molecular Biology of the Cell, ISSN 1059-1524, E-ISSN 1939-4586, Vol. 29, no 26. Article in journal, Meeting abstract (Other academic). Published
Place, publisher, year, edition, pages
American Society for Cell Biology, 2018
National Category
Subatomic Physics
Identifiers
urn:nbn:se:kth:diva-303810 (URN); 000505772701037 (ISI)
Note

QC 20211021

Available from: 2021-10-21. Created: 2021-10-21. Last updated: 2023-12-07. Bibliographically approved
Thul, P. J., Åkesson, L., Wiking, M., Mahdessian, D., Geladaki, A., Ait Blal, H., . . . Lundberg, E. (2017). A subcellular map of the human proteome. Science, 356(6340), Article ID 820.
2017 (English). In: Science, ISSN 0036-8075, E-ISSN 1095-9203, Vol. 356, no 6340, article id 820. Article in journal (Refereed). Published
Abstract [en]

Resolving the spatial distribution of the human proteome at a subcellular level can greatly increase our understanding of human biology and disease. Here we present a comprehensive image-based map of subcellular protein distribution, the Cell Atlas, built by integrating transcriptomics and antibody-based immunofluorescence microscopy with validation by mass spectrometry. Mapping the in situ localization of 12,003 human proteins at a single-cell level to 30 subcellular structures enabled the definition of the proteomes of 13 major organelles. Exploration of the proteomes revealed single-cell variations in abundance or spatial distribution and localization of about half of the proteins to multiple compartments. This subcellular map can be used to refine existing protein-protein interaction networks and provides an important resource to deconvolute the highly complex architecture of the human cell.

Place, publisher, year, edition, pages
American Association for the Advancement of Science, 2017
Keywords
antibody, proteome, biology, cells and cell components, disease incidence, image analysis, physiological response, protein, proteomics, spatial distribution, Article, cell organelle, cellular distribution, human, human cell, immunofluorescence microscopy, mass spectrometry, priority journal, protein analysis, protein localization, protein protein interaction, single cell analysis, transcriptomics
National Category
Cell Biology
Identifiers
urn:nbn:se:kth:diva-216588 (URN); 10.1126/science.aal3321 (DOI); 000401957900032 (ISI); 28495876 (PubMedID); 2-s2.0-85019201137 (Scopus ID)
Note

QC 20171208

Available from: 2017-12-08. Created: 2017-12-08. Last updated: 2024-03-15. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-0028-5865