A machine learning-based framework for modeling transcription elongation.

Affiliations

Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China.

Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China.

Publication information

Proc Natl Acad Sci U S A. 2021 Feb 9;118(6). doi: 10.1073/pnas.2007450118.

Abstract

RNA polymerase II (Pol II) generally pauses at certain positions along gene bodies, thereby interrupting the transcription elongation process, which is often coupled with important biological functions such as precursor mRNA splicing and gene expression regulation. Characterizing transcription elongation dynamics can thus help us understand many essential biological processes in eukaryotic cells. However, experimentally measuring Pol II elongation rates is generally time- and resource-intensive. We developed PEPMAN (polymerase II elongation pausing modeling through attention-based deep neural network), a deep learning-based model that accurately predicts Pol II pausing sites from native elongating transcript sequencing (NET-seq) data. By taking full advantage of the attention mechanism, PEPMAN is able to decipher important sequence features underlying Pol II pausing. More importantly, we demonstrated that analyzing the PEPMAN predictions around various types of alternative splicing sites can provide useful clues for understanding cotranscriptional splicing events. In addition, associating the PEPMAN predictions with different epigenetic features can help reveal important factors related to the transcription elongation process. Together, these results demonstrate that PEPMAN provides a useful and effective tool for modeling transcription elongation and understanding the related biological factors from available high-throughput sequencing data.

Similar articles

Efficient RNA polymerase II pause release requires U2 snRNP function.
Mol Cell. 2021 May 6;81(9):1920-1934.e9. doi: 10.1016/j.molcel.2021.02.016. Epub 2021 Mar 8.

Cited by

Stochastic modeling of the mRNA life process: A generalized master equation.
Biophys J. 2023 Oct 17;122(20):4023-4041. doi: 10.1016/j.bpj.2023.08.024. Epub 2023 Aug 30.
The histone chaperone FACT: a guardian of chromatin structure integrity.
Transcription. 2022 Feb-Jun;13(1-3):16-38. doi: 10.1080/21541264.2022.2069995. Epub 2022 Apr 29.

