Analysis of public RNA-sequencing data reveals biological consequences of genetic heterogeneity in cell line populations
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Protein Science, Systems Biology. ORCID iD: 0000-0003-0492-9960
KTH, School of Engineering Sciences in Chemistry, Biotechnology and Health (CBH), Protein Science, Systems Biology. ORCID iD: 0000-0001-6990-1905
2018 (English). In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 8, article id 11226. Article in journal (Refereed). Published
Abstract [en]

Meta-analyses of datasets available in public repositories are used to gather and summarise experiments performed across laboratories, as well as to explore the consistency of scientific findings. As differences in data quality and biological equivalency across samples may obscure such analyses and consequently their conclusions, we investigated the comparability of 85 public RNA-seq cell line datasets. Thousands of pairwise comparisons of single nucleotide variants in 139 samples revealed variable genetic heterogeneity of the eight cell line populations analysed, as well as variable data quality. The H9 and HCT116 cell lines were found to be remarkably stable across laboratories (with median concordances of 99.2% and 98.5%, respectively), in contrast to the highly variable HeLa cells (89.3%). We show that the genetic heterogeneity encountered greatly affects gene expression between same-cell comparisons, highlighting the importance of interrogating the biological equivalency of samples when comparing experimental datasets. Both the number of differentially expressed genes and the expression levels negatively correlate with the genetic heterogeneity. Finally, we demonstrate how comparing genetically heterogeneous datasets affects gene expression analyses and that high dissimilarity between same-cell datasets alters the expression of more than 300 cancer-related genes, which are often the focus of studies using cell lines.
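
The pairwise comparison of single nucleotide variants mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how concordance is computed, so the definition used here (the percentage of shared variant positions with identical genotypes) is an assumption, and the function names and toy profiles are hypothetical rather than taken from the authors' software.

```python
from itertools import combinations
from statistics import median


def concordance(profile_a, profile_b):
    """Percentage of shared variant positions whose genotypes match.

    Each profile maps a (chromosome, position) tuple to a genotype string.
    Returns None when the two profiles share no positions.
    """
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        return None
    matches = sum(profile_a[pos] == profile_b[pos] for pos in shared)
    return 100.0 * matches / len(shared)


def median_pairwise_concordance(profiles):
    """Median concordance over all pairs of profiles that share positions."""
    scores = [concordance(a, b) for a, b in combinations(profiles, 2)]
    scores = [s for s in scores if s is not None]
    return median(scores) if scores else None


# Hypothetical toy profiles for three samples of the same cell line.
sample_1 = {("chr1", 100): "A/G", ("chr1", 200): "C/C",
            ("chr2", 50): "T/T", ("chr3", 10): "G/G"}
sample_2 = {("chr1", 100): "A/G", ("chr1", 200): "C/T",
            ("chr2", 50): "T/T", ("chr3", 10): "G/G"}
sample_3 = {("chr1", 100): "A/G", ("chr1", 200): "C/C",
            ("chr2", 50): "T/T", ("chr4", 5): "A/A"}

print(concordance(sample_1, sample_2))
print(median_pairwise_concordance([sample_1, sample_2, sample_3]))
```

Restricting the comparison to positions present in both profiles is one possible design choice; in RNA-seq data a variant can only be called where the gene is expressed and covered, so positions missing from one sample are treated here as uninformative rather than discordant.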

Place, publisher, year, edition, pages
Nature Publishing Group, 2018. Vol. 8, article id 11226
National Category
Medical Genetics
Identifiers
URN: urn:nbn:se:kth:diva-232882
DOI: 10.1038/s41598-018-29506-3
ISI: 000439686700049
PubMedID: 30046134
Scopus ID: 2-s2.0-85050698721
OAI: oai:DiVA.org:kth-232882
DiVA, id: diva2:1237673
Note

QC 20180809

Available from: 2018-08-09. Created: 2018-08-09. Last updated: 2019-05-15. Bibliographically approved
In thesis
1. Exploring genetic heterogeneity in cancer using high-throughput DNA and RNA sequencing
2018 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

High-throughput sequencing (HTS) technology has revolutionised the biomedical sciences, where it is used to analyse the genetic makeup and gene expression patterns of both primary patient tissue samples and models cultivated in vitro. This makes it especially useful for research on cancer, a disease that is characterised by its deadliness and genetic heterogeneity. This inherent genetic variation is an important aspect that warrants exploration, and the depth and breadth that HTS possesses make it well suited to investigating this facet of cancer.

The types of analyses that may be accomplished with HTS technologies are many, but they may be divided into two groups: those that analyse the DNA of the sample in question, and those that work on the RNA. While DNA-based methods give information regarding the genetic landscape of the sample, RNA-based analyses yield data regarding gene expression patterns; both of these methods have already been used to investigate the heterogeneity present in cancer. While RNA-based methods are traditionally used exclusively for expression analyses, the data they yield may also be utilised to investigate the genetic variation present in the samples. This type of RNA-based analysis is seldom performed, however, and valuable information is thus ignored.

The aim of this thesis is the development and application of DNA- and RNA-based HTS methods for analysing genetic heterogeneity within the context of cancer. The present investigation demonstrates not only that RNA-based sequencing may be used to successfully differentiate in vitro cancer models through their genetic makeup, but also that this may be done for primary patient data. A pipeline for these types of analyses is established and evaluated, and is shown both to be robust to several technical parameters and to possess a broad scope of analytical possibilities. Genetic variation within cancer models in public databases is evaluated and demonstrated to affect gene expression in several cases. Both inter- and intra-patient genetic heterogeneity are shown using the established pipeline, which also demonstrates that cancerous cells are more heterogeneous than their normal neighbours. Finally, two open source bioinformatics software packages are presented.

The results presented herein demonstrate that genetic analyses using RNA-based methods represent excellent complements to already existing DNA-based techniques, and further increase the already large scope of how HTS technologies may be utilised.

Place, publisher, year, edition, pages
Stockholm: Kungliga tekniska högskolan, 2018. p. 83
Series
TRITA-CBH-FOU ; 2018:31
Keywords
Biotechnology, bioinformatics, RNA-seq, WGS, WES, systems biology, variant analysis, single nucleotide variant, gene expression, machine learning, clustering, open source, R, bioconductor, Python
National Category
Medical Biotechnology Bioinformatics and Systems Biology
Research subject
Biotechnology
Identifiers
URN: urn:nbn:se:kth:diva-234265
ISBN: 978-91-7729-918-9
Public defence
2018-10-05, FR4, Oskar Klein's Auditorium, Albanova, Stockholm, 10:00 (English)
Note

QC 20180906

Available from: 2018-09-06. Created: 2018-09-05. Last updated: 2018-09-06. Bibliographically approved

Open Access in DiVA

fulltext (1756 kB), 16 downloads
File information
File name: FULLTEXT01.pdf
File size: 1756 kB
Checksum (SHA-512): 9f7951b6d76c974d1fede993798107f161bb1c9c607367f8ab0b83b20e2579d90101d2b9dd31d677afae45c986bcadfe19829e0868ad0207ccbe8a1132b1e19d
Type: fulltext
Mimetype: application/pdf

Authority records

Fasterius, Erik; Al-Khalili Szigyarto, Cristina
