Vespucci: a system for building annotated databases of nascent transcripts.

Affiliations

Department of Cellular and Molecular Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA

Department of Bioinformatics and Systems Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA

San Diego Center for Systems Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0375, USA

A.I. Virtanen Institute, Department of Biotechnology and Molecular Medicine, University of Eastern Finland, P.O. Box 1627, 70120 Kuopio, Finland

Institute for Genomic Medicine and Scripps Institution of Oceanography, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA

Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0651, USA

Publication information

Nucleic Acids Res. 2014 Feb;42(4):2433-47. doi: 10.1093/nar/gkt1237. Epub 2013 Dec 4.

DOI: 10.1093/nar/gkt1237

PMID: 24304890

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3936758/

Abstract

Global run-on sequencing (GRO-seq) is a recent addition to the series of high-throughput sequencing methods that enables new insights into transcriptional dynamics within a cell. However, GRO-sequencing presents new algorithmic challenges, as existing analysis platforms for ChIP-seq and RNA-seq do not address the unique problem of identifying transcriptional units de novo from short reads located all across the genome. Here, we present a novel algorithm for de novo transcript identification from GRO-sequencing data, along with a system that determines transcript regions, stores them in a relational database and associates them with known reference annotations. We use this method to analyze GRO-sequencing data from primary mouse macrophages and derive novel quantitative insights into the extent and characteristics of non-coding transcription in mammalian cells. In doing so, we demonstrate that Vespucci expands existing annotations for mRNAs and lincRNAs by defining the primary transcript beyond the polyadenylation site. In addition, Vespucci generates assemblies for un-annotated non-coding RNAs such as those transcribed from enhancer-like elements. Vespucci thereby provides a robust system for defining, storing and analyzing diverse classes of primary RNA transcripts that are of increasing biological interest.
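
The abstract names three steps (determine transcript regions, store them in a relational database, associate them with reference annotations) without giving algorithmic detail. The Python sketch below illustrates that general workflow only; it is not Vespucci's implementation. It assumes a deliberately simple rule (same-strand reads separated by at most `max_gap` bases are merged into one candidate region), an in-memory SQLite database, and hypothetical table and column names chosen for this example.

```python
import sqlite3


def stitch_reads(reads, max_gap=100):
    """Merge same-strand reads lying within max_gap bases into regions.

    reads: iterable of (chrom, strand, start, end), 0-based half-open.
    Returns a list of (chrom, strand, start, end, read_count) tuples.
    NOTE: the gap rule and its default are assumptions for illustration.
    """
    regions = []
    # Sorting tuples groups reads by chromosome and strand, then by start.
    for chrom, strand, start, end in sorted(reads):
        last = regions[-1] if regions else None
        if (last and last[0] == chrom and last[1] == strand
                and start - last[3] <= max_gap):
            last[3] = max(last[3], end)  # extend the open region
            last[4] += 1                 # count the read
        else:
            regions.append([chrom, strand, start, end, 1])
    return [tuple(r) for r in regions]


def build_database(regions, annotations):
    """Load regions and reference annotations into an in-memory database.

    annotations: iterable of (name, chrom, strand, start, end).
    The schema is hypothetical, not the schema used by the paper.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE transcript (
            id INTEGER PRIMARY KEY, chrom TEXT, strand TEXT,
            tx_start INTEGER, tx_end INTEGER, read_count INTEGER);
        CREATE TABLE annotation (
            id INTEGER PRIMARY KEY, name TEXT, chrom TEXT, strand TEXT,
            a_start INTEGER, a_end INTEGER);
    """)
    conn.executemany(
        "INSERT INTO transcript (chrom, strand, tx_start, tx_end, read_count)"
        " VALUES (?, ?, ?, ?, ?)", regions)
    conn.executemany(
        "INSERT INTO annotation (name, chrom, strand, a_start, a_end)"
        " VALUES (?, ?, ?, ?, ?)", annotations)
    return conn


# Each transcript is paired with any same-strand annotation it overlaps;
# transcripts with no overlap keep a NULL name (candidate un-annotated RNA).
OVERLAP_QUERY = """
SELECT t.id, t.tx_start, t.tx_end, a.name
FROM transcript t
LEFT JOIN annotation a
  ON a.chrom = t.chrom AND a.strand = t.strand
 AND a.a_start < t.tx_end AND a.a_end > t.tx_start
ORDER BY t.id
"""

if __name__ == "__main__":
    demo_reads = [("chr1", "+", 100, 150), ("chr1", "+", 180, 230),
                  ("chr1", "+", 1000, 1050), ("chr1", "-", 120, 170)]
    demo_regions = stitch_reads(demo_reads)
    demo_conn = build_database(demo_regions, [("GeneA", "chr1", "+", 90, 200)])
    for row in demo_conn.execute(OVERLAP_QUERY):
        print(row)
```

In this toy input, the first two plus-strand reads merge into one region that overlaps the made-up annotation `GeneA`, while the distant plus-strand read and the minus-strand read remain separate, un-annotated regions. A `LEFT JOIN` is used so that un-annotated regions stay in the result rather than being dropped.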

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14b7/3936758/71b231a87ffd/gkt1237f1p.jpg
