Lognormality and oscillations in the coverage of high-throughput transcriptomic data towards gene ends
2013 (English). In: Journal of Statistical Mechanics: Theory and Experiment, ISSN 1742-5468, Vol. 2013, no. 10, P10013. Article in journal (Refereed), Published
High-throughput transcriptomics experiments have reached the stage where the number of reads alignable to a given position can be treated as an almost-continuous signal. This allows us to ask questions of a biophysical or biotechnical nature that may still have biological implications. Here we show that when RNA fragments are sequenced from one end, as is the case on most platforms, an oscillation in the read count is observed at the other end. We further show that these oscillations can be well described by Kolmogorov's 1941 broken stick model. We investigate how the model can be used to improve predictions of gene ends (3' transcript ends), but conclude that with present data the improvement is only marginal. The results highlight subtle effects in high-throughput transcriptomics experiments which do not have a biological origin, but which may still be used to obtain biological information.
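As a rough illustration of why a broken stick process is associated with log-normality, the sketch below follows one fragment per stick through repeated random breaks. It is not the analysis from the paper: the uniform break fraction, the number of breaks, and all function and variable names are assumptions chosen for illustration. Because each fragment length is a product of independent random fractions, its logarithm is a sum of independent terms and is therefore approximately normal.

```python
import math
import random
import statistics


def broken_stick_lengths(n_sticks, n_breaks, rng):
    """Follow one fragment per stick through n_breaks successive random breaks.

    Assumption (illustrative only): at every break the surviving fragment
    keeps a uniformly distributed fraction of its current length.
    """
    lengths = []
    for _ in range(n_sticks):
        length = 1.0
        for _ in range(n_breaks):
            # 1 - random() lies in (0, 1], so the length never becomes zero
            length *= 1.0 - rng.random()
        lengths.append(length)
    return lengths


def skewness(values):
    """Population skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is reproducible
lengths = broken_stick_lengths(n_sticks=20000, n_breaks=30, rng=rng)
log_lengths = [math.log(x) for x in lengths]

# For a uniform fraction U, ln(U) has mean -1 and variance 1, so after
# 30 breaks the log-length should have mean near -30 and variance near 30.
mean_log = statistics.fmean(log_lengths)
var_log = statistics.pvariance(log_lengths)
skew_log = skewness(log_lengths)

print(f"mean of log-length:     {mean_log:.2f}")
print(f"variance of log-length: {var_log:.2f}")
print(f"skewness of log-length: {skew_log:.2f}")
```

The skewness of the log-lengths shrinks towards zero as the number of breaks grows, which is the sense in which the fragment lengths themselves become approximately log-normal.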
Place, publisher, year, edition, pages
Institute of Physics (IOP), 2013. Vol. 2013, no. 10, P10013.
Modelling, Artefacts, RNAseq, Log-normal distribution, Broken stick, 3'end of transcripts
Bioinformatics and Systems Biology; Other Physics Topics
Research subject SRA - Molecular Bioscience
Identifiers
URN: urn:nbn:se:kth:diva-136077
DOI: 10.1088/1742-5468/2013/10/P10013
ISI: 000326869000014
ScopusID: 2-s2.0-84888614314
OAI: oai:DiVA.org:kth-136077
DiVA: diva2:670501
QC 20131220
2013-12-03; 2013-12-03; 2015-09-30
Bibliographically approved