Genome-guided transcript assembly by integrative analysis of RNA sequence data.

Authors

Boley Nathan, Stoiber Marcus H, Booth Benjamin W, Wan Kenneth H, Hoskins Roger A, Bickel Peter J, Celniker Susan E, Brown James B

Affiliations

Department of Biostatistics, University of California at Berkeley, Berkeley, California, USA.

Department of Genome Dynamics, Lawrence Berkeley National Laboratory, Berkeley, California, USA.

Publication information

Nat Biotechnol. 2014 Apr;32(4):341-6. doi: 10.1038/nbt.2850. Epub 2014 Mar 16.

DOI:10.1038/nbt.2850
PMID:24633242
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4037530/
Abstract

The identification of full length transcripts entirely from short-read RNA sequencing data (RNA-seq) remains a challenge in the annotation of genomes. Here we describe an automated pipeline for genome annotation that integrates RNA-seq and gene-boundary data sets, which we call Generalized RNA Integration Tool, or GRIT. Applying GRIT to Drosophila melanogaster short-read RNA-seq, cap analysis of gene expression (CAGE) and poly(A)-site-seq data collected for the modENCODE project, we recovered the vast majority of previously annotated transcripts and doubled the total number of transcripts cataloged. We found that 20% of protein coding genes encode multiple protein-localization signals and that, in 20-d-old adult fly heads, genes with multiple polyadenylation sites are more common than genes with alternative splicing or alternative promoters. GRIT demonstrates 30% higher precision and recall than the most widely used transcript assembly tools. GRIT will facilitate the automated generation of high-quality genome annotations without the need for extensive manual annotation.
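The precision and recall figures quoted in the abstract can be made concrete with a small sketch. The code below is an illustration only, not the paper's evaluation protocol: it assumes a predicted transcript counts as correct when its chromosome, strand, and full intron chain exactly match a reference transcript, and the function name `precision_recall` and the toy coordinates are hypothetical.

```python
from typing import Iterable, Tuple

# Assumed representation (hypothetical): a transcript is identified by its
# chromosome, strand, and intron chain, i.e. a tuple of (start, end) introns.
Intron = Tuple[int, int]
Transcript = Tuple[str, str, Tuple[Intron, ...]]


def precision_recall(
    predicted: Iterable[Transcript], reference: Iterable[Transcript]
) -> Tuple[float, float]:
    """Transcript-level precision and recall under exact intron-chain matching."""
    pred_set = set(predicted)
    ref_set = set(reference)
    true_positives = len(pred_set & ref_set)
    # Precision: fraction of predicted transcripts found in the reference.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: fraction of reference transcripts recovered by the prediction.
    recall = true_positives / len(ref_set) if ref_set else 0.0
    return precision, recall


# Toy data with made-up coordinates, for illustration only.
reference = [
    ("chr2L", "+", ((100, 200), (300, 400))),
    ("chr2L", "+", ((100, 200),)),
    ("chr3R", "-", ((50, 80),)),
    ("chrX", "+", ((10, 20), (40, 60))),
]
predicted = [
    ("chr2L", "+", ((100, 200), (300, 400))),  # matches the reference
    ("chr3R", "-", ((50, 80),)),               # matches the reference
    ("chr3R", "-", ((50, 90),)),               # intron end differs: no match
]

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of three predictions match and two of four reference transcripts are recovered. Exact intron-chain matching ignores transcript start and end positions, so an evaluation that also uses gene-boundary data such as CAGE or poly(A)-site-seq would need a stricter matching rule.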

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6985/4037530/e0b04f2098c8/nihms566373f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6985/4037530/d16c1619fa57/nihms566373f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6985/4037530/ce28b4d5e6ab/nihms566373f2.jpg
