Mapping and annotating the mammalian body-wide protein-coding gene expression
KTH, Centres, Science for Life Laboratory, SciLifeLab. KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Protein Science, Systems Biology. (Integrative omics and precision medicine) ORCID iD: 0000-0002-7000-4416
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

A central aim of fundamental research is to create conditions necessary for fueling further research and innovation. Our understanding of basic biology is central for future developments of tools for diagnosing, monitoring, and treating disease. This doctoral thesis focuses on mapping the mammalian protein-coding gene expression in healthy cells and tissues, and annotation of genes based on their expression patterns, specificity, location, and function. This has in large part been achieved by using large-scale transcriptomic and proteomic profiling to describe the gene expression landscape that defines the identities of the great diversity of cells present in mammals. Characterization of gene expression across different tissues and cell types provides fundamental tools to enable the exploration, summary, and ultimately, the annotation of the mammalian proteome, which is still incomplete.

The studies comprising this thesis have contributed to the Human Protein Atlas, an online open-access portal for proteomic and transcriptomic data that aims to profile each human protein-coding gene and thereby create a spatial map of the molecular organization of the human body, providing basic tools for the scientific community. Paper I comprises an effort to catalogue all proteins that are actively secreted from cells, defining the human secretome. Paper II entails the deep characterization and annotation of the protein-coding transcriptome of 18 peripheral immune cell types. Paper III describes the most comprehensive tissue-based transcriptomic profiling to date of protein-coding genes in 98 tissues of the pig, an increasingly important model animal. Paper IV extends previous tissue-based maps of the human protein-coding genome by integration of 13 single cell transcriptome datasets. Paper V explores the human protein-coding genome in a clustering-based annotation of co-expressed genes across single cells and tissues to provide a framework for finding previously unknown functional relationships between genes by the principle of “guilt-by-association”.

In summary, the work described here entails a small contribution to the grand effort of spatially mapping proteins across tissues and cell types, for building a framework of biological knowledge that can lead to increased understanding of the constituents that make us humans.

Abstract [sv] (English translation)

A central goal of basic science is to create the conditions for future research and innovation. Our understanding of basic biology is essential for the development of tools for diagnosing, monitoring, and treating disease. This thesis focuses on mapping protein-coding gene expression in healthy mammalian cells and tissues, and on annotating genes based on their expression patterns, specificity, localization, and function. This has largely been achieved through large-scale transcriptomics- and proteomics-based profiling to describe the gene expression patterns that define the identities of the great diversity of cells found in mammals. The characterization of gene expression across tissues and cell types provides important tools for enabling the exploration, compilation, and ultimately the annotation of the mammalian proteome, which is still incomplete.

The studies that make up this thesis have contributed to the Human Protein Atlas, an open-access online portal for proteomics and transcriptomics data that aims to describe the expression of all protein-coding genes. By creating a map of the molecular organization of the human body, this project constitutes an essential tool for research. Paper I catalogues all proteins that are actively secreted from cells in order to define the human secretome. Paper II concerns a deep characterization and annotation of the protein-coding transcriptome of 18 peripheral immune cell types. Paper III describes the most comprehensive tissue-based map to date of protein-coding genes across 98 tissues in pig, which has become an increasingly important model organism. Paper IV extends the previous tissue-based maps of the protein-coding genome by integrating 13 single-cell transcriptomics datasets. Paper V explores the human protein-coding genome through a clustering-based annotation of co-expressed genes, to build a framework for finding previously unknown functional relationships between genes according to the principle of “guilt by association”.

The work described here is a contribution to the extensive effort to map the localization of proteins in tissues and cells, in order to build a framework of biological knowledge that can lead to an increased understanding of the components that make us human.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2022, p. 75
Series
TRITA-CBH-FOU ; 2022:32
Keywords [en]
protein, protein-coding, genes, annotation, atlas, scRNA-Seq, RNA-Seq, transcriptomics, proteomics
National Category
Bioinformatics and Computational Biology
Research subject
Biotechnology
Identifiers
URN: urn:nbn:se:kth:diva-312033; ISBN: 978-91-8040-250-7 (print); OAI: oai:DiVA.org:kth-312033; DiVA, id: diva2:1657352
Public defence
2022-06-03, Eva & Georg Klein, Biomedicum, Solnavägen 9, via Zoom: https://kth-se.zoom.us/j/66922122998, Solna, 09:30 (English)
Funder
Knut and Alice Wallenberg Foundation
Note

QC 2022-05-11

Available from: 2022-05-11. Created: 2022-05-10. Last updated: 2025-02-07. Bibliographically approved
List of papers
1. The human secretome
2019 (English). In: Science Signaling, ISSN 1945-0877, E-ISSN 1937-9145, Vol. 12, no 609, article id eaaz0274. Article in journal (Refereed). Published
Abstract [en]

The proteins secreted by human cells (collectively referred to as the secretome) are important not only for the basic understanding of human biology but also for the identification of potential targets for future diagnostics and therapies. Here, we present a comprehensive analysis of proteins predicted to be secreted in human cells, which provides information about their final localization in the human body, including the proteins actively secreted to peripheral blood. The analysis suggests that a large number of the proteins of the secretome are not secreted out of the cell, but instead are retained intracellularly, whereas another large group of proteins were identified that are predicted to be retained locally at the tissue of expression and not secreted into the blood. Proteins detected in the human blood by mass spectrometry-based proteomics and antibody-based immuno-assays are also presented with estimates of their concentrations in the blood. The results are presented in an updated version 19 of the Human Protein Atlas in which each gene encoding a secretome protein is annotated to provide an open-access knowledge resource of the human secretome, including body-wide expression data, spatial localization data down to the single-cell and subcellular levels, and data about the presence of proteins that are detectable in the blood.

Place, publisher, year, edition, pages
NLM (Medline), 2019
National Category
Biochemistry Molecular Biology Cell Biology
Identifiers
urn:nbn:se:kth:diva-265462 (URN); 10.1126/scisignal.aaz0274 (DOI); 000499099300003 (); 31772123 (PubMedID); 2-s2.0-85075677906 (Scopus ID)
Note

QC 20191218

Available from: 2019-12-18. Created: 2019-12-18. Last updated: 2025-02-20. Bibliographically approved
2. A genome-wide transcriptomic analysis of protein-coding genes in human blood cells
2019 (English). In: Science, ISSN 0036-8075, E-ISSN 1095-9203, Vol. 366, no 6472, p. 1471-+, article id eaax9198. Article in journal (Refereed). Published
Abstract [en]

Blood is the predominant source for molecular analyses in humans, both in clinical and research settings. It is the target for many therapeutic strategies, emphasizing the need for comprehensive molecular maps of the cells constituting human blood. In this study, we performed a genome-wide transcriptomic analysis of protein-coding genes in sorted blood immune cell populations to characterize the expression levels of each individual gene across the blood cell types. All data are presented in an interactive, open-access Blood Atlas as part of the Human Protein Atlas and are integrated with expression profiles across all major tissues to provide spatial classification of all protein-coding genes. This allows for a genome-wide exploration of the expression profiles across human immune cell populations and all major human tissues and organs.

Place, publisher, year, edition, pages
American Association for the Advancement of Science, 2019
National Category
Genetics and Genomics
Identifiers
urn:nbn:se:kth:diva-266527 (URN); 10.1126/science.aax9198 (DOI); 000503861000045 (); 31857451 (PubMedID); 2-s2.0-85077091174 (Scopus ID)
Note

QC 20200205

Available from: 2020-02-05. Created: 2020-02-05. Last updated: 2025-02-07. Bibliographically approved
3. Genome-wide annotation of protein-coding genes in pig
2022 (English). In: BMC Biology, E-ISSN 1741-7007, Vol. 20, no 1, article id 25. Article in journal (Refereed). Published
Abstract [en]

Background: There is a need for functional genome-wide annotation of the protein-coding genes to get a deeper understanding of mammalian biology. Here, a new annotation strategy is introduced based on dimensionality reduction and density-based clustering of whole-body co-expression patterns. This strategy has been used to explore the gene expression landscape in pig, and we present a whole-body map of all protein-coding genes in all major pig tissues and organs. Results: An open-access pig expression map (www.rnaatlas.org) is presented based on the expression of 350 samples across 98 well-defined pig tissues divided into 44 tissue groups. A new UMAP-based classification scheme is introduced, in which all protein-coding genes are stratified into tissue expression clusters based on body-wide expression profiles. The distribution and tissue specificity of all 22,342 protein-coding pig genes are presented. Conclusions: Here, we present a new genome-wide annotation strategy based on dimensionality reduction and density-based clustering. A genome-wide resource of the transcriptome map across all major tissues and organs in pig is presented, and the data is available as an open-access resource (www.rnaatlas.org), including a comparison to the expression of human orthologs.
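The abstract above names density-based clustering on a low-dimensional embedding but not the specific algorithm. As an illustration only, the following minimal sketch applies DBSCAN, one common density-based method chosen here as an assumption, to two-dimensional coordinates standing in for a precomputed UMAP embedding of genes. The coordinates and the `eps` and `min_pts` values are made up for the example and are not taken from the paper.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise.

    points  -- list of coordinate tuples (here: assumed 2D embedding of genes)
    eps     -- neighbourhood radius
    min_pts -- minimum neighbours (including the point itself) for a core point
    """
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force radius query; fine for a small illustrative data set.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical embedding: two dense groups of genes and one isolated gene.
embedding = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
    (5.0, 5.0), (5.0, 5.1), (5.1, 5.0),
    (10.0, 0.0),
]
cluster_labels = dbscan(embedding, eps=0.3, min_pts=3)
```

In this toy input the first four points form one cluster, the next three a second cluster, and the last point is labelled as noise. Genes sharing a cluster label would then be candidates for a shared tissue expression annotation.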

Place, publisher, year, edition, pages
Springer Nature, 2022
Keywords
Annotation, Protein coding genes, Genome wide, Transcriptome, Gene expression, Tissue expression profile
National Category
Biochemistry Molecular Biology Medical Biotechnology Bioinformatics and Computational Biology
Identifiers
urn:nbn:se:kth:diva-307759 (URN); 10.1186/s12915-022-01229-y (DOI); 000746863800002 (); 35073880 (PubMedID); 2-s2.0-85123754738 (Scopus ID)
Note

QC 20220209

Available from: 2022-02-09. Created: 2022-02-09. Last updated: 2025-02-20. Bibliographically approved
4. A single-cell type transcriptomics map of human tissues
2021 (English). In: Science Advances, E-ISSN 2375-2548, Vol. 7, no 31, article id eabh2169. Article in journal (Refereed). Published
Abstract [en]

Advances in molecular profiling have opened up the possibility to map the expression of genes in cells, tissues, and organs in the human body. Here, we combined single-cell transcriptomics analysis with spatial antibody-based protein profiling to create a high-resolution single-cell type map of human tissues. An open access atlas has been launched to allow researchers to explore the expression of human protein-coding genes in 192 individual cell type clusters. An expression specificity classification was performed to determine the number of genes elevated in each cell type, allowing comparisons with bulk transcriptomics data. The analysis highlights distinct expression clusters corresponding to cell types sharing similar functions, both within the same organs and between organs.

Place, publisher, year, edition, pages
American Association for the Advancement of Science (AAAS), 2021
National Category
Biochemistry Molecular Biology
Identifiers
urn:nbn:se:kth:diva-299689 (URN); 10.1126/sciadv.abh2169 (DOI); 000678723800005 (); 34321199 (PubMedID); 2-s2.0-85111485342 (Scopus ID)
Note

QC 20210817

Available from: 2021-08-17. Created: 2021-08-17. Last updated: 2025-02-20. Bibliographically approved
5. Genome-wide single cell annotation of the human protein-coding genes
(English). Manuscript (preprint) (Other academic)
Abstract [en]

An important quest for the life science community is to deliver a complete annotation of the human building-blocks of life, the genes and the proteins. Here, we report on a genome-wide effort to annotate all protein-coding genes based on single cell transcriptomics data representing all major tissues and organs in the human body, integrated with data from bulk transcriptomics and antibody-based tissue profiling. Altogether, 25 tissues have been analyzed with single cell transcriptomics, resulting in genome-wide expression in 444 single cell types using a strategy that pools data from individual cells to obtain genome-wide expression profiles of individual cell types. We introduce a new genome-wide classification tool based on clustering of similar expression profiles across single cell types, which can be visualized using dimensional reduction maps (UMAP). The clustering classification is integrated with a new “tau” score classification for all protein-coding genes, resulting in a measure of single cell specificity across all cell types for all individual genes. The analysis has allowed us to annotate all human protein-coding genes with regard to function and spatial distribution across individual cell types in all major tissues and organs in the human body. A new version of the open access Human Protein Atlas (www.proteinatlas.org) has been launched to enable researchers to explore the new genome-wide annotation on an individual gene level.
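The pooling and specificity steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the manuscript's implementation: it assumes that pooling means summing raw counts over all cells sharing a cell type label and scaling to counts per million, and that the “tau” score corresponds to the widely used specificity index tau = Σ(1 − x_i / x_max) / (n − 1), where x_i is the expression in cell type i and n is the number of cell types. The function names, the normalization, and the example counts are hypothetical.

```python
from collections import defaultdict


def pool_by_cell_type(cell_counts, cell_types):
    """Sum counts over cells with the same label, then scale to counts per million.

    cell_counts -- list of {gene: raw count} dicts, one per cell
    cell_types  -- list of cell type labels, aligned with cell_counts
    """
    pooled = defaultdict(lambda: defaultdict(float))
    for counts, label in zip(cell_counts, cell_types):
        for gene, n in counts.items():
            pooled[label][gene] += n
    profiles = {}
    for label, genes in pooled.items():
        total = sum(genes.values())
        scale = 1e6 / total if total else 0.0
        profiles[label] = {gene: n * scale for gene, n in genes.items()}
    return profiles


def tau(expression):
    """Specificity index for one gene across cell types.

    0.0 means equal expression everywhere; 1.0 means expression in a single
    cell type. Unexpressed genes get 0.0 here, which is a convention chosen
    for this sketch only.
    """
    values = list(expression)
    if len(values) < 2:
        raise ValueError("tau needs at least two cell types")
    peak = max(values)
    if peak <= 0:
        return 0.0
    return sum(1 - v / peak for v in values) / (len(values) - 1)


# Hypothetical input: three cells, two cell types, two genes.
cells = [{"GENE_A": 1, "GENE_B": 3}, {"GENE_A": 1, "GENE_B": 3}, {"GENE_A": 4}]
labels = ["T cell", "T cell", "B cell"]
profiles = pool_by_cell_type(cells, labels)

# Build one vector per gene, treating a missing gene as zero expression.
gene_b = [profile.get("GENE_B", 0.0) for profile in profiles.values()]
gene_b_tau = tau(gene_b)  # GENE_B is seen in one cell type only
```

A gene detected in only one pooled cell type, such as `GENE_B` in this toy input, receives the maximum score of 1.0, while a gene with identical expression in every cell type receives 0.0.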

Keywords
protein, annotation, clustering, specificity, tissue, single-cell, RNA-Seq, scRNA-Seq
National Category
Bioinformatics and Computational Biology
Research subject
Biotechnology
Identifiers
urn:nbn:se:kth:diva-312021 (URN)
Note

QC 20220524

Available from: 2022-05-09. Created: 2022-05-09. Last updated: 2025-02-07. Bibliographically approved

Open Access in DiVA

summary (2335 kB), 477 downloads
File information
File name: SUMMARY01.pdf
File size: 2335 kB
Checksum SHA-512: 51f5a333154b8b7bb9df522ec0d182285721423fa704f88bf270f2e2a9682aaabd24654ffcf8a99284fea63992b0693a800d39f7f9f155b4b715684cfebdcea7
Type: fulltext
Mimetype: application/pdf

Authority records

Karlsson, Max
