On Transcriptome Sequencing
KTH, School of Biotechnology (BIO), Gene Technology. (Division of Gene Technology)
2009 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis is about the use of massive DNA sequencing to investigate the transcriptome. During recent decades, several studies have made it clear that the transcriptome comprises a more complex set of biochemical machinery than was previously believed. The majority of the genome can be expressed as transcripts, and overlapping and antisense transcription is widespread. New technologies for the interrogation of nucleic acids have made it possible to investigate such cellular phenomena in much greater detail than ever before. For each application, special requirements need to be met. The work presented in this thesis focuses on the transcriptome and the development of technology for its analysis. In paper I, we report our development of an automated approach for sample preparation. The procedure was benchmarked against a publicly available reference data set, and we note that our approach outperformed similar manual procedures in terms of reproducibility. In the work reported in papers II-IV, we used different massive sequencing technologies to investigate the transcriptome. In paper II, we describe a concatemerization approach that increased throughput by 65% using 454 sequencing, and we identify classes of transcripts not previously described in Populus. Papers III and IV both report studies based on SOLiD sequencing. In the former, we investigated transcripts and proteins for 13% of the human genes and detected a massive overlap for the upper 50% of transcriptional levels. In the work described in paper IV, we investigated transcription in non-genic regions of the genome and detected expression from a high number of previously unknown loci.

Place, publisher, year, edition, pages
Stockholm: KTH, 2009. 52 p.
Series
Trita-BIO-Report, ISSN 1654-2312 ; 2009:26
Keyword [en]
Transcriptome, RNA-seq, DNA sequencing, gene expression profiling, non-coding RNA, small RNA
National Category
Other Biological Topics; Bioinformatics and Systems Biology; Genetics; Biochemistry and Molecular Biology; Cell and Molecular Biology
Identifiers
URN: urn:nbn:se:kth:diva-11446
ISBN: 978-91-7415-490-0 (print)
OAI: oai:DiVA.org:kth-11446
DiVA: diva2:278576
Public defence
2009-12-18, Oscar Klein (FR4), Roslagstullsbacken 21, Albanova University Center, Stockholm, 09:00 (Swedish)
Note
QC 20100723
Available from: 2009-12-03 Created: 2009-11-10 Last updated: 2010-07-23 Bibliographically approved
List of papers
1. Automation of cDNA Synthesis and Labelling Improves Reproducibility
2009 (English) In: Journal of Biomedicine and Biotechnology, ISSN 1110-7243, E-ISSN 1110-7251, Vol. 2009, 396808. Article in journal (Refereed) Published
Abstract [en]

Background. Several technologies, such as in-depth sequencing and microarrays, enable large-scale interrogation of genomes and transcriptomes. In this study, we assess reproducibility and throughput by moving all laboratory procedures to a robotic workstation capable of handling superparamagnetic beads. Here, we describe a fully automated procedure for cDNA synthesis and labelling for microarrays, where the purification steps prior to and after labelling are based on precipitation of DNA on carboxylic acid-coated paramagnetic beads. Results. The fully automated procedure allows samples arrayed on a microtiter plate to be processed in parallel without manual intervention while ensuring high reproducibility. We compare our results to a manual sample preparation procedure and, in addition, use a comprehensive reference dataset to show that the protocol described performs better than similar manual procedures. Conclusions. We demonstrate, in an automated gene expression microarray experiment, a reduced variance between replicates, resulting in an increase in the statistical power to detect differentially expressed genes, thus allowing smaller differences between samples to be identified. With minor modifications, this protocol can be used to create cDNA libraries for other applications, such as in-depth analysis using next-generation sequencing technologies.

Keyword
control maqc project; microarray; gene; platforms; database; ncbi
National Category
Other Biological Topics
Identifiers
urn:nbn:se:kth:diva-11442 (URN)
10.1155/2009/396808 (DOI)
000271697700001 ()
Note
QC 20100723
Available from: 2009-11-10 Created: 2009-11-10 Last updated: 2017-12-12 Bibliographically approved
2. Genome-wide profiling of Populus small RNAs
2009 (English) In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 10, article number 620. Article in journal (Refereed) Published
Abstract [en]

Background: Short RNAs, and in particular microRNAs, are important regulators of gene expression both within defined regulatory pathways and at the epigenetic scale. We investigated the short RNA (sRNA) population (18-24 nt) of the transcriptome of green leaves from the sequenced Populus trichocarpa using a concatenation strategy in combination with 454 sequencing. Results: The most abundant size class of sRNAs was 24 nt, and these were generally associated with a number of classes of retrotransposons and repetitive elements. Some repetitive elements were also associated with 22 nt RNAs. We identified an sRNA hot-spot on chromosome 19, overlapping a region containing both the sex-determining loci and a major cluster of NBS-LRR genes. A number of phased siRNA loci were identified, a subset of which are predicted to target PPR and NBS-LRR disease resistance genes, classes of genes that have been significantly expanded in Populus. Additional loci enriched for sRNA production were identified. We identified 15 novel predicted microRNAs (miRNAs), including miRNA* sequences, and identified a novel locus that may encode a dual miRNA or a miRNA and short interfering RNAs (siRNAs). Conclusions: The short RNA population of P. trichocarpa is at least as complex as that of Arabidopsis. We provide a first genome-wide view of short RNA production for P. trichocarpa and identify new, non-conserved miRNAs.

Keyword
stress-responsive micrornas; dna-methylation; arabidopsis-thaliana; sirna biogenesis; repetitive elements; mirna genes; trichocarpa; plants; database; expression
National Category
Other Biological Topics
Identifiers
urn:nbn:se:kth:diva-11443 (URN)
10.1186/1471-2164-10-620 (DOI)
000273971100001 ()
2-s2.0-75449114906 (Scopus ID)
Note
QC 20100723
Available from: 2009-11-10 Created: 2009-11-10 Last updated: 2017-12-12 Bibliographically approved
3. Analysis of transcript and protein overlap in a human osteosarcoma cell line
2010 (English) In: BMC Genomics, ISSN 1471-2164, E-ISSN 1471-2164, Vol. 11, no. 1, 684. Article in journal (Refereed) Published
Abstract [en]

Background: An interesting field of research in genomics and proteomics is the comparison of the overlap between the transcriptome and the proteome. Recently, the tools to analyze gene and protein expression on a whole-genome scale have been improved, including the availability of new-generation sequencing instruments and high-throughput antibody-based methods to analyze the presence and localization of proteins. In this study, we used massive transcriptome sequencing (RNA-seq) to investigate the transcriptome of a human osteosarcoma cell line and compared the expression levels with protein data obtained in situ from antibody-based immunohistochemistry (IHC) and immunofluorescence microscopy (IF).

Results: A large-scale analysis based on 2749 genes was performed, corresponding to approximately 13% of the protein coding genes in the human genome. We found both RNA and protein present for a large fraction of the analyzed genes, with 60% of the analyzed human genes detected by all three methods. Only 34 genes (1.2%) were not detected on the transcriptional or protein level with any method. Our data suggest that the majority of the human genes are expressed at detectable transcript or protein levels in this cell line. Since the reliability of antibodies depends on possible cross-reactivity, we compared the RNA and protein data using antibodies with different reliability scores based on various criteria, including Western blot analysis. Gene products detected in all three platforms generally have good antibody validation scores, while those detected only by antibodies, but not by RNA sequencing, generally consist of more low-scoring antibodies.

Conclusion: This suggests that some antibodies are staining the cells in an unspecific manner, and that assessment of transcript presence by RNA-seq can provide guidance for validation of the corresponding antibodies.

Keyword
antibody detection, article, cancer cell culture, controlled study, cross reaction, gene expression, gene identification, genetic association, genetic transcription, human, human cell, immunofluorescence microscopy, immunohistochemistry, osteosarcoma cell, RNA sequence, Western blotting
National Category
Other Biological Topics
Identifiers
urn:nbn:se:kth:diva-11444 (URN)
10.1186/1471-2164-11-684 (DOI)
000285887400001 ()
2-s2.0-78649536411 (Scopus ID)
Funder
Knut and Alice Wallenberg Foundation; Swedish Research Council
Note
QC 20100723. Updated from manuscript to article (20110207).
Available from: 2009-11-10 Created: 2009-11-10 Last updated: 2017-12-12 Bibliographically approved
4. In-depth transcriptome analysis reveals novel TARs and prevalent antisense transcription in human cell lines.
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Several recent studies have indicated that transcription is pervasive in regions outside of protein coding genes and that short antisense transcripts can originate from the promoter and terminator regions of genes. Here we investigate transcription of fragments longer than 200 nucleotides, focusing on antisense transcription for known protein coding genes and intergenic transcription. We find that roughly 12% to 16% of all reads that originate from promoter and terminator regions, respectively, map antisense to the gene in question. Furthermore, we detect a high number of novel transcriptionally active regions (TARs) that are generally expressed at a lower level than protein coding genes. We also investigate the correlation between RNA-seq data and microarray data and conclude that the correlation is dependent on gene length, such that longer genes show a better correlation.

Keyword
Transcriptome, non-coding RNA, DNA sequencing, next generation sequencing
National Category
Other Biological Topics
Identifiers
urn:nbn:se:kth:diva-11445 (URN)
Note
QC 20100723
Available from: 2009-11-10 Created: 2009-11-10 Last updated: 2010-07-23 Bibliographically approved

Open Access in DiVA

fulltext (1466 kB), 1166 downloads
File information
File name: FULLTEXT01.pdf
File size: 1466 kB
Checksum (SHA-512): a5969978b889a9824e4894d4d7d28d411425c6aab6e5ff6f8906ecb29b82d3c11808b5cb113fd207e4d1a99541adfd7e5d867fba901f33b128f6e206aed92789
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Klevebring, Daniel
By organisation
Gene Technology
