SGP-1: prediction and validation of homologous genes based on sequence alignments.

Authors

Wiehe T, Gebauer-Jung S, Mitchell-Olds T, Guigó R

Affiliation

Max Planck Institute for Chemical Ecology, Jena, Germany.

Publication

Genome Res. 2001 Sep;11(9):1574-83. doi: 10.1101/gr.177401.

Abstract

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential, or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
