Statistical inference of the rate of RNA polymerase II elongation by total RNA sequencing.

Affiliations

Department of Statistical Science, The Graduate University for Advanced Studies (SOKENDAI), Tachikawa, Japan.

Department of Statistical Modeling, The Institute of Statistical Mathematics, Research Organization of Information and Systems, Tachikawa, Japan.

Publication information

Bioinformatics. 2019 Jun 1;35(11):1877-1884. doi: 10.1093/bioinformatics/bty886.

DOI:10.1093/bioinformatics/bty886

PMID:30376061

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6546130/

Abstract

MOTIVATION

Sequencing total RNA without poly-A selection enables us to obtain a transcriptomic profile of nascent RNAs undergoing transcription with co-transcriptional splicing. In general, the RNA-seq reads exhibit a sawtooth pattern in a gene, which is characterized by a monotonically decreasing gradient across introns in the 5'-3' direction, and by substantially higher levels of RNA-seq reads present in exonic regions. Such patterns result from the process of underlying transcription elongation by RNA polymerase II, which traverses the DNA strand in a 5'-3' direction as it performs a complex series of mRNA synthesis and processing. Therefore, data of sequenced total RNAs could be utilized to infer the rate of transcription elongation by solving the inverse problem.

RESULTS

Although solving the inverse problem in total RNA-seq has great potential, statistical methods for it have not yet been fully developed. We demonstrate to what extent the newly developed method can be useful. The objective is to reconstruct the spatial distribution of transcription elongation rates in a gene from a given noisy, sawtooth-like profile. It is necessary to recover the signal source of the elongation rates separately from several types of nuisance factors, such as unobserved modes of co-transcriptionally occurring mRNA splicing, which exert significant influences on the sawtooth shape. The present method was tested using published total RNA-seq data derived from mouse embryonic stem cells. We investigated the spatial characteristics of the estimated elongation rates, focusing especially on the relation to promoter-proximal pausing of RNA polymerase II, nucleosome occupancy and histone modification patterns.

AVAILABILITY AND IMPLEMENTATION

A C implementation of PolSter and sample data are available at https://github.com/yoshida-lab/PolSter.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
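
Illustrative sketch

The link between the intronic gradient and the elongation rate described under MOTIVATION can be illustrated with a toy calculation. The sketch below is not PolSter and does not reproduce the authors' statistical model. It assumes a single intron, a constant elongation rate, a constant initiation rate, and instantaneous splicing once the polymerase reaches the 3' end of the intron. Under those assumptions the expected coverage at intron position x is proportional to (L - x) / v, so the slope of the gradient is -initiation / v and the rate can be read off a least-squares line. All function names, parameters and numeric values are hypothetical.

```python
import random


def intron_coverage(length, rate, initiation=1.0, step=100):
    """Expected nascent-read coverage at sampled intron positions.

    Toy assumption: coverage at x counts polymerases located between x and
    the intron 3' end, i.e. initiation * (length - x) / rate.
    """
    positions = list(range(0, length, step))
    coverage = [initiation * (length - x) / rate for x in positions]
    return positions, coverage


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def estimate_rate(xs, ys, initiation=1.0):
    """Invert the toy model: rate = -initiation / slope."""
    slope = fit_slope(xs, ys)
    if slope >= 0:
        raise ValueError("expected a decreasing intronic gradient")
    return -initiation / slope


# Hypothetical 50 kb intron transcribed at 2000 bp/min (assumed values).
rng = random.Random(0)
xs, clean = intron_coverage(length=50_000, rate=2_000.0)
noisy = [c + rng.gauss(0.0, 0.5) for c in clean]  # additive Gaussian noise
print("estimated rate (bp/min):", round(estimate_rate(xs, noisy)))
```

The estimate is only identifiable up to the initiation rate, which is why `initiation` must be supplied. A real profile also contains exonic jumps, position-dependent rates and unobserved splicing modes, which is the reason the abstract calls for a dedicated inverse-problem method rather than a straight-line fit.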

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a243/6546130/45b7a4fc1aeb/bty886f1.jpg
