DNA-sequence and epigenomic determinants of local rates of transcription elongation.

Authors

Liu Lingjie, Zhao Yixin, Siepel Adam

Affiliations

Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

Graduate Program in Genetics, Stony Brook University, Stony Brook, NY.

Publication information

bioRxiv. 2023 Dec 23:2023.12.21.572932. doi: 10.1101/2023.12.21.572932.

DOI:10.1101/2023.12.21.572932

PMID:38187771

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10769381/

Abstract

Across all branches of life, transcription elongation is a crucial, regulated phase in gene expression. Many recent studies in eukaryotes have focused on the regulation of promoter-proximal pausing of RNA Polymerase II (Pol II), but rates of productive elongation also vary substantially throughout the gene body, both within and across genes. Here, we introduce a probabilistic model for systematically evaluating potential determinants of the local elongation rate based on nascent RNA sequencing (NRS) data. Our model is derived from a unified model for both the kinetics of Pol II movement along the DNA template and the generation of NRS read counts at steady state. It allows for a continuously variable elongation rate along the gene body, with the rate at each nucleotide defined by a generalized linear relationship with nearby genomic and epigenomic features. High-dimensional feature vectors are accommodated through a sparse-regression extension. We show with simulations that the model allows accurate detection of associated features and accurate prediction of local elongation rates. In an analysis of public PRO-seq and epigenomic data, we identify several features that are strongly associated with reductions in the local elongation rate, including DNA methylation, splice sites, RNA stem-loops, CTCF binding sites, and several histone marks, including H3K36me3 and H4K20me1. By contrast, low-complexity sequences and H3K79me2 marks are associated with increases in elongation rate. In an analysis of DNA k-mers, we find that cytosine nucleotides are strongly associated with reductions in local elongation rate, particularly when preceded by guanines and followed by adenines or thymines. Increases in elongation rate are associated with thymines and A+T-rich k-mers. These associations are generally shared across cell types, and by considering them our model is effective at predicting features of held-out PRO-seq data.
Overall, our analysis is the first to permit genome-wide predictions of relative nucleotide-specific elongation rates based on complex sets of genomic and epigenomic covariates. We have made predictions available for the K562, CD14+, MCF-7, and HeLa-S3 cell types in a UCSC Genome Browser track.
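
The abstract does not state the model's equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a log-linear link between features and the local rate (zeta_i = exp(kappa . y_i)), assumes that the expected steady-state read count at a site is inversely proportional to that rate (chi / zeta_i), and stands in a plain L1-penalized Poisson fit for the sparse-regression extension. All names (`kappa`, `chi`, `fit_kappa`, the toy feature matrix) are hypothetical, and `chi` is treated as known for simplicity.

```python
import math

def local_rates(features, kappa):
    """Assumed log-linear link: zeta_i = exp(kappa . y_i) at each site."""
    return [math.exp(sum(k * y for k, y in zip(kappa, row))) for row in features]

def expected_counts(features, kappa, chi):
    """Assumed steady state: slower sites hold more Pol II, so mean reads = chi / zeta_i."""
    return [chi / zeta for zeta in local_rates(features, kappa)]

def soft_threshold(value, threshold):
    """Proximal step for an L1 penalty: shrink toward zero, clamping small values to 0."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def fit_kappa(counts, features, chi, l1=0.0, lr=0.05, steps=3000):
    """Proximal gradient ascent on the mean Poisson log-likelihood minus l1 * |kappa|."""
    n, p = len(features), len(features[0])
    kappa = [0.0] * p
    for _ in range(steps):
        mu = expected_counts(features, kappa, chi)
        # d/d kappa_j of sum_i [x_i * log(mu_i) - mu_i], with log(mu_i) = log(chi) - kappa . y_i
        grad = [
            sum((m - x) * row[j] for m, x, row in zip(mu, counts, features)) / n
            for j in range(p)
        ]
        kappa = [soft_threshold(k + lr * g, lr * l1) for k, g in zip(kappa, grad)]
    return kappa

# Toy data: two binary features per site (hypothetical, e.g. "methylated", "low-complexity").
features = [[0, 0], [1, 0], [0, 1], [1, 1]] * 25
true_kappa = [-0.7, 0.4]   # negative = slower elongation, positive = faster
counts = expected_counts(features, true_kappa, chi=5.0)  # noise-free counts for clarity

kappa_mle = fit_kappa(counts, features, chi=5.0)             # unpenalized fit
kappa_sparse = fit_kappa(counts, features, chi=5.0, l1=0.5)  # L1-penalized fit

print([round(k, 3) for k in kappa_mle])
print([round(k, 3) for k in kappa_sparse])
```

In this sketch a negative coefficient corresponds to a feature associated with slower local elongation (and therefore more reads), which is the direction the abstract reports for features such as DNA methylation. The counts here are noise-free expected values, so the example illustrates only the link function and the fitting step, not the statistical behavior on real PRO-seq data.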

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be8b/10769381/ddf37844b9f6/nihpp-2023.12.21.572932v1-f0001.jpg
