An Annotation Agnostic Algorithm for Detecting Nascent RNA Transcripts in GRO-Seq.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2017 Sep-Oct;14(5):1070-1081. doi: 10.1109/TCBB.2016.2520919. Epub 2016 Jan 26.

Abstract

We present a fast and simple algorithm to detect nascent RNA transcription in global nuclear run-on sequencing (GRO-seq). GRO-seq is a relatively new protocol that captures nascent transcripts from actively engaged polymerase, providing a direct read-out on bona fide transcription. Most traditional assays, such as RNA-seq, measure steady state RNA levels which are affected by transcription, post-transcriptional processing, and RNA stability. GRO-seq data, however, presents unique analysis challenges that are only beginning to be addressed. Here, we describe a new algorithm, Fast Read Stitcher (FStitch), that takes advantage of two popular machine-learning techniques, hidden Markov models and logistic regression, to classify which regions of the genome are transcribed. Given a small user-defined training set, our algorithm is accurate, robust to varying read depth, annotation agnostic, and fast. Analysis of GRO-seq data without a priori need for annotation uncovers surprising new insights into several aspects of the transcription process.
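The combination of logistic regression and a hidden Markov model named in the abstract can be illustrated with a toy sketch. This is not FStitch's implementation: the function names (`train_logistic`, `viterbi`, `segments`), the single per-bin coverage feature, the two-state layout, and the `stay` self-transition probability are all assumptions made for illustration. Using the classifier's probability directly as the HMM emission score is likewise a simplification chosen here, not something the abstract specifies.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(active | x) = sigmoid(w*x + b) by batch gradient descent.

    xs: one hypothetical coverage feature per labelled bin; ys: 0/1 labels.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def viterbi(probs, stay=0.9):
    """Most likely 0/1 state path, where probs[i] = P(active | bin i).

    The classifier probability is used as the emission score (an assumption).
    """
    eps = 1e-9
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)

    def emit(p, s):
        return math.log(max(p if s == 1 else 1.0 - p, eps))

    score = [emit(probs[0], s) + math.log(0.5) for s in (0, 1)]
    back = []
    for p in probs[1:]:
        new, ptr = [], []
        for s in (0, 1):
            cands = [score[r] + (log_stay if r == s else log_switch)
                     for r in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            new.append(cands[best] + emit(p, s))
            ptr.append(best)
        score = new
        back.append(ptr)
    # Trace back from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def segments(path):
    """Half-open (start, end) bin intervals where the path is in state 1."""
    out, start = [], None
    for i, s in enumerate(path):
        if s == 1 and start is None:
            start = i
        elif s == 0 and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(path)))
    return out

if __name__ == "__main__":
    # Small user-labelled training set (toy values).
    w, b = train_logistic([0, 1, 0, 2, 1, 6, 7, 5, 8, 6],
                          [0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
    coverage = [0, 1, 0, 6, 7, 5, 8, 6, 0, 0, 1]
    probs = [sigmoid(w * x + b) for x in coverage]
    print(segments(viterbi(probs)))
```

In this sketch the transition penalty is what "stitches" reads together: a single low-scoring bin inside an otherwise active stretch is cheaper to absorb than two state switches, so the region is reported as one contiguous call rather than two fragments.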
