Otto-Warburg-Laboratory, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany.
Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin 14195, Germany.
Nucleic Acids Res. 2021 May 7;49(8):4402-4420. doi: 10.1093/nar/gkab208.
Pausing of transcribing RNA polymerase is regulated and creates opportunities to control gene expression. Research in metazoans has so far mainly focused on RNA polymerase II (Pol II) promoter-proximal pausing, leaving the pervasive nature of pausing and its regulatory potential in mammalian cells unclear. Here, we developed a pause detecting algorithm (PDA) for nucleotide-resolution occupancy data and a new native elongating transcript sequencing approach, termed nested NET-seq, that strongly reduces artifactual peaks commonly misinterpreted as pausing sites. Leveraging PDA and nested NET-seq reveals widespread genome-wide Pol II pausing at single-nucleotide resolution in human cells. Notably, the majority of Pol II pauses occur outside of promoter-proximal gene regions, primarily along the gene body of transcribed genes. Sequence analysis combined with machine learning modeling reveals DNA sequence properties underlying widespread transcriptional pausing, including a new pause motif. Interestingly, key sequence determinants of RNA polymerase pausing are conserved between human cells and bacteria. These studies indicate pervasive sequence-induced transcriptional pausing in human cells, and the knowledge of exact pause locations implies potential functional roles in gene expression.