Using native and syntenically mapped cDNA alignments to improve de novo gene finding.

Authors

Stanke Mario, Diekhans Mark, Baertsch Robert, Haussler David

Affiliation

Center for Biomolecular Science and Engineering, University of California Santa Cruz (UCSC), Santa Cruz, CA 95064, USA.

Publication

Bioinformatics. 2008 Mar 1;24(5):637-44. doi: 10.1093/bioinformatics/btn013. Epub 2008 Jan 24.

DOI: 10.1093/bioinformatics/btn013
PMID: 18218656

Abstract

MOTIVATION

Computational annotation of protein coding genes in genomic DNA is a widely used and essential tool for analyzing newly sequenced genomes. However, current methods suffer from inaccuracy and do poorly with certain types of genes. Including additional sources of evidence of the existence and structure of genes can improve the quality of gene predictions. For many eukaryotic genomes, expressed sequence tags (ESTs) are available as evidence for genes. Related genomes that have been sequenced, annotated, and aligned to the target genome provide evidence of existence and structure of genes.

RESULTS

We incorporate several different evidence sources into the gene finder AUGUSTUS. The sources of evidence are gene and transcript annotations from related species syntenically mapped to the target genome using TransMap, evolutionary conservation of DNA, mRNA and ESTs of the target species, and retroposed genes. The predictions include alternative splice variants where evidence supports it. Using only ESTs we were able to correctly predict at least one splice form exactly correct in 57% of human genes. Also using evidence from other species and human mRNAs, this number rises to 77%. Syntenic mapping is well-suited to annotate genomes closely related to genomes that are already annotated or for which extensive transcript evidence is available. Native cDNA evidence is most helpful when the alignments are used as compound information rather than independent positionwise information.

AVAILABILITY

AUGUSTUS is open source and available at http://augustus.gobics.de. The gene predictions for human can be browsed and downloaded at the UCSC Genome Browser (http://genome.ucsc.edu).
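
The gene-level figures quoted in the results (57% and 77% of genes with at least one splice form predicted exactly) can be illustrated with a small sketch. The abstract does not describe the evaluation protocol, so everything here is an assumption made for illustration: transcripts are modeled as tuples of coding-exon intervals, a match means identical intervals, and the function name `gene_level_sensitivity` and the toy coordinates are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a transcript is a tuple of (start, end) coding-exon intervals.
Transcript = Tuple[Tuple[int, int], ...]


def gene_level_sensitivity(
    annotated: Dict[str, List[Transcript]],
    predicted: List[Transcript],
) -> float:
    """Fraction of annotated genes with at least one splice form predicted exactly."""
    if not annotated:
        return 0.0
    predicted_set: Set[Transcript] = set(predicted)
    # A gene counts as a hit if any of its annotated splice forms
    # appears unchanged among the predictions.
    hits = sum(
        1
        for transcripts in annotated.values()
        if any(t in predicted_set for t in transcripts)
    )
    return hits / len(annotated)


# Toy data (hypothetical coordinates, not taken from the paper).
annotated = {
    "geneA": [((100, 200), (300, 400)), ((100, 200), (350, 400))],
    "geneB": [((1000, 1100), (1200, 1300))],
    "geneC": [((5000, 5200),)],
}
predicted = [
    ((100, 200), (350, 400)),      # matches the second splice form of geneA
    ((1000, 1100), (1210, 1300)),  # one exon boundary off, so not an exact match
    ((5000, 5200),),               # matches geneC
]

sensitivity = gene_level_sensitivity(annotated, predicted)
print(f"{sensitivity:.2f}")
```

In this toy case two of the three genes have an exactly predicted splice form. A real evaluation would also need to account for chromosome and strand, which the sketch leaves out.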

Similar articles

1
Using native and syntenically mapped cDNA alignments to improve de novo gene finding.
Bioinformatics. 2008 Mar 1;24(5):637-44. doi: 10.1093/bioinformatics/btn013. Epub 2008 Jan 24.
2
[Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].
Yi Chuan Xue Bao. 2004 May;31(5):431-43.
3
AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome.
Genome Biol. 2006;7 Suppl 1(Suppl 1):S11.1-8. doi: 10.1186/gb-2006-7-s1-s11. Epub 2006 Aug 7.
4
Using ESTs to improve the accuracy of de novo gene prediction.
BMC Bioinformatics. 2006 Jul 3;7:327. doi: 10.1186/1471-2105-7-327.
5
Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies.
Nucleic Acids Res. 2003 Oct 1;31(19):5654-66. doi: 10.1093/nar/gkg770.
6
Characterization of 954 bovine full-CDS cDNA sequences.
BMC Genomics. 2005 Nov 23;6:166. doi: 10.1186/1471-2164-6-166.
7
GASS: genome structural annotation for Eukaryotes based on species similarity.
BMC Genomics. 2015 Mar 4;16(1):150. doi: 10.1186/s12864-015-1353-3.
8
Gene structure prediction and alternative splicing analysis using genomically aligned ESTs.
Genome Res. 2001 May;11(5):889-900. doi: 10.1101/gr.155001.
9
Predicting Genes in Single Genomes with AUGUSTUS.
Curr Protoc Bioinformatics. 2019 Mar;65(1):e57. doi: 10.1002/cpbi.57. Epub 2018 Nov 22.
10
Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus.
Bioinformatics. 2004 May 1;20(7):1157-69. doi: 10.1093/bioinformatics/bth058. Epub 2004 Feb 5.

Cited by

1
A comprehensive water buffalo pangenome reveals extensive structural variation linked to population-specific signatures of selection.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf099.
2
The near-complete genome assembly of provides insights into its origin, evolution, and the regulation of flavonoid biosynthesis.
Front Plant Sci. 2025 Aug 11;16:1580779. doi: 10.3389/fpls.2025.1580779. eCollection 2025.
3
Whole genome sequencing and annotations of Trametes sanguinea ZHSJ.
Sci Data. 2025 Aug 21;12(1):1460. doi: 10.1038/s41597-025-05798-9.
4
Better together: Subgenomes for allotetraploid potato wild relative Solanum acaule Bitt. reveal origins in Petota Clade 3 and 4.
Plant Genome. 2025 Sep;18(3):e70095. doi: 10.1002/tpg2.70095.
5
SeqForge: A scalable platform for alignment-based searches, motif detection, and sequence curation across meta/genomic datasets.
bioRxiv. 2025 Aug 15:2025.08.12.669971. doi: 10.1101/2025.08.12.669971.
6
A comparison of 27 Arabidopsis thaliana genomes and the path toward an unbiased characterization of genetic polymorphism.
Nat Genet. 2025 Aug 19. doi: 10.1038/s41588-025-02293-0.
7
Navigating Eukaryotic Genome Annotation Pipelines: A Route Map to Using BRAKER, Galba, and TSEBRA.
Methods Mol Biol. 2025;2935:67-107. doi: 10.1007/978-1-0716-4583-3_4.
8
A chromosome-level Mitragyna parvifolia genome unveils spirooxindole alkaloid diversification and mitraphylline biosynthesis.
Plant Cell. 2025 Sep 9;37(9). doi: 10.1093/plcell/koaf207.
9
The highly dynamic pangenome of basal chordates is enriched in defence and immunity genes and is inherited following the Mendelian law.
PLoS Genet. 2025 Aug 18;21(8):e1011833. doi: 10.1371/journal.pgen.1011833. eCollection 2025 Aug.
10
BPA: a BERT-based priority annotation strategy for assessing the rationality of aquatic algal protein sequences.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf401.