Whole-genome mapping of 5′ RNA ends in bacteria by tagged sequencing: a comprehensive view in Enterococcus faecalis
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB. (Computational Biological Physics, CBP)
Department of Biosystems Science and Engineering, ETH Zürich, CH-4058, Basel, Switzerland.
KTH, School of Computer Science and Communication (CSC), Computational Biology, CB. Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4362, Esch-sur-Alzette, Luxembourg.
INRA, UMR1319 Micalis, Domaine de Vilvert, F-78352, Jouy-en-Josas, France.
2015 (English). In: RNA, ISSN 1355-8382. Article in journal (Refereed). Published.
Abstract [en]

Enterococcus faecalis is the third leading cause of nosocomial infections. To obtain the first snapshot of transcriptional organization in this bacterium, we used a modified RNA-seq approach that discriminates primary from processed 5′ RNA ends. We also validated our approach by confirming known features in Escherichia coli. We mapped 559 transcription start sites (TSSs) and 352 processing sites (PSSs) in E. faecalis. A blind motif search retrieved canonical features of SigA- and SigN-dependent promoters preceding the mapped transcription start sites. We discovered 85 novel putative regulatory RNAs, small and antisense RNAs, and 72 transcriptional antisense organizations. The presented data constitute a significant insight into bacterial RNA landscapes and a step toward the comprehensive inference of regulatory processes at the transcriptional and post-transcriptional levels.

Place, publisher, year, edition, pages
RNA Society, 2015.
Keyword [en]
primary RNA, processed RNA, promoter, RNA degradation, Enterococcus faecalis
National Category
Bioinformatics and Systems Biology; Microbiology; Genetics
Research subject
Biological Physics
Identifiers
URN: urn:nbn:se:kth:diva-163570
DOI: 10.1261/rna.048470.114
ISI: 000353068400022
Scopus ID: 2-s2.0-84928006918
OAI: oai:DiVA.org:kth-163570
DiVA: diva2:801237
Funder
Swedish Research Council, 621-2012-2982
Note

QC 20150417

Available from: 2015-04-08. Created: 2015-04-08. Last updated: 2015-09-30. Bibliographically approved.
In thesis
1. Data Analysis and Next Generation Sequencing : Applications in Microbiology.
2015 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Next Generation Sequencing (NGS) is a new technology that has revolutionized the way we study living organisms. Where previously only a few genes could be studied at a time through targeted direct probing, NGS offers the possibility to perform measurements for a whole genome at once. The drawback is that the amount of data generated in the process is large and extracting useful information from it requires new methods to process and analyze it.

The main contribution of this thesis is the development of a novel experimental method coined tagRNA-seq, combining 5’tagRACE, a previously developed technique, with RNA-sequencing technology. Briefly, tagRNA-seq makes it possible to identify the 5’ ends of RNAs in bacteria and directly probe for their type, primary or processed, by ligating short RNA sequences, the tags, to the beginnings of RNA molecules. We used the method to directly probe for transcription start and processing sites in two bacterial species, Escherichia coli and Enterococcus faecalis. It was also used to study polyadenylation in E. coli, where the ability to identify processed RNA molecules proved useful for separating direct and indirect regulatory effects of this mechanism. We also demonstrate how data from tagRNA-seq experiments can be used to increase confidence in the discovery of antisense transcripts in bacteria. Analyses of RNA-seq data obtained in the context of these experiments revealed subtle artifacts in the coverage signal towards gene ends, which we were able to explain and quantify based on Kolmogorov’s broken stick model. We also discovered evidence for circularization of a few RNA transcripts, both in our own data sets and in publicly available data.

Designing the tags used in tagRNA-seq led us to the problem of words absent from a text. We focus on a particular subset of these, the minimal absent words (MAWs), and develop a theory providing a complete description of their size distribution in random text. We also show that MAWs in genomes from viruses and living organisms almost always exhibit a behavior different from random texts in the tail of the distribution, and that MAWs from this tail are closely related to sequences present in the genome that preferentially appear in regions with important regulatory functions.

Finally, and independently of tagRNA-seq, we propose a new approach to the problem of bacterial community reconstruction in metagenomics, based on techniques from compressed sensing. We provide a novel algorithm that competes with state-of-the-art techniques in the field.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2015. xviii, 154 p.
Series
TRITA-CSC-A, ISSN 1653-5723 ; 2015:15
Keyword
RNA-seq, tagRNA-seq, primary and processed RNA, Enterococcus faecalis, Complex transcription, Metagenomics, 5'tagRACE, minimal absent words, compressed sensing, metagenomics, bacterial community reconstruction
National Category
Bioinformatics (Computational Biology); Microbiology; Other Biological Topics; Genetics
Research subject
Biological Physics
Identifiers
urn:nbn:se:kth:diva-173219 (URN)
978-91-7595-699-2 (ISBN)
Public defence
2015-10-30, FA32, Roslagstullsbacken 21, Stockholm, 14:00 (English)
Note

QC 20150930

Available from: 2015-09-30. Created: 2015-09-07. Last updated: 2015-11-06. Bibliographically approved.

Open Access in DiVA

fulltext (2544 kB), 30 downloads
File information
File name: FULLTEXT01.pdf; File size: 2544 kB; Checksum SHA-512:
356213cf1787791d4cf139b5449814cd52d4be294701e0cd89386228caeaf725c62f74adef53f239b0d3a025cdc318c03daac04522699b941e526cbbaeb889a6
Type: fulltext; Mimetype: application/pdf

Supplemental material (824 kB), 66 downloads
File information
File name: ATTACHMENT02.pdf; File size: 824 kB; Checksum SHA-512:
e78fad93168bfe59cb8162f7979f5cf07c6e80f3ca3e4bad5e29cdf589e5ab942f8d4cfa875fbdb08b3e4bb4d044c5a10fd3256b19316092bf1c79314e31f36b
Type: attachment; Mimetype: application/pdf

SupplementayTableSF NovelTranscripts (81 kB), 12 downloads
File information
File name: ATTACHMENT03.zip; File size: 81 kB; Checksum SHA-512:
4355beb79a664e0e18a0e5b2a8822a8781f7074cc6de74c582f01ae4482a896d602b3cd563b4dac794048344a441e0d69a63c511133d8819c6ac3e1125fe01a1
Type: attachment; Mimetype: application/zip

SupplementaryTableSE-PSSs in ppRNome (84 kB), 10 downloads
File information
File name: FULLTEXT02.zip; File size: 84 kB; Checksum SHA-512:
ace6902407f3c1ecc2ad5bb8ecdd8cfc99f0ffa8fba37046099beb53dfaf6de9cca64ed534c90e5aae0735a0f80a7b41a81d0bad17169808df102d4fb56b2c24
Type: attachment; Mimetype: application/zip

SupplementaryTableSD-RNAlevels (519 kB), 12 downloads
File information
File name: ATTACHMENT04.zip; File size: 519 kB; Checksum SHA-512:
be0594b6aaedd893fbeaafbd70f3504f1ea607bed1533069e61f51bd331267be8d6f202c09add695dc55ac586f0ac988ce59a1ff62c141d543e2819cc13bd6a1
Type: attachment; Mimetype: application/zip

SupplementaryTableSC-MEMEandTRANSTERM (836 kB), 12 downloads
File information
File name: ATTACHMENT05.zip; File size: 836 kB; Checksum SHA-512:
37b1ed54371a82a0d2b69f15ab3ede109c0e77cb620c84cafbd1a9bff338b6e224c530ea0627fca98e660688dd8bce614a27ad449bfd707d9f33f3b79ba0a422
Type: attachment; Mimetype: application/zip

SupplementaryTableSB-TSSs (101 kB), 10 downloads
File information
File name: ATTACHMENT06.zip; File size: 101 kB; Checksum SHA-512:
52df07d806a01094cc3cb762f8bfa58949545c7d5c92954cf9745e3e94abf64bc1bb21f23a9fdd54f0cf5093827dd5438332f8df4dddc61e3ade099bd6520163
Type: attachment; Mimetype: application/zip

SupplementaryTableSA-tagRNAseqdata (855 kB), 16 downloads
File information
File name: ATTACHMENT07.zip; File size: 855 kB; Checksum SHA-512:
42fafcec6b70828be71a1d916f9d85b1b1313a99bc19749cc7cd1f5213c13a086e7167cad3e9d3d8372afca44461cce30202f1d28b75a73409954f3f7635d090
Type: attachment; Mimetype: application/zip

Search in DiVA

By author/editor
Innocenti, Nicolas; Fouquier d'Hérouël, Aymeric; Aurell, Erik
By organisation
Computational Biology, CB
Bioinformatics and Systems Biology; Microbiology; Genetics