Restauro-G: a rapid genome re-annotation system for comparative genomics.

Authors

Tamaki Satoshi, Arakawa Kazuharu, Kono Nobuaki, Tomita Masaru

Affiliation

Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan.

Publication

Genomics Proteomics Bioinformatics. 2007 Feb;5(1):53-8. doi: 10.1016/S1672-0229(07)60014-X.

Abstract

Annotations of complete genome sequences submitted directly from sequencing projects vary in annotation strategy and update frequency. These inconsistencies make comparative studies difficult. To allow rapid data preparation for a large number of complete genomes, automation and speed are important for genome re-annotation. Here we introduce an open-source rapid genome re-annotation software system, Restauro-G, specialized for bacterial genomes. Restauro-G re-annotates a genome by similarity searches using the BLAST-Like Alignment Tool (BLAT), referring to protein databases such as UniProt KB, NCBI nr, NCBI COGs, Pfam, and PSORTb. Re-annotation by Restauro-G achieved over 98% accuracy for most bacterial chromosomes in comparison with the original manually curated annotation of EMBL releases. Restauro-G was developed in the generic bioinformatics workbench G-language Genome Analysis Environment and is distributed at http://restauro-g.iab.keio.ac.jp/ under the GNU General Public License.
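The similarity-search step and the accuracy comparison described above can be illustrated with a minimal Python sketch. It is not Restauro-G's implementation (which was developed in the G-language Genome Analysis Environment); it only assumes that hits are available as 12-column, tab-separated BLAST-style tabular lines, takes the highest-scoring hit per gene, and measures agreement with an original annotation. The function names `best_hits` and `agreement`, the E-value cutoff, and the exact-match definition of agreement are all hypothetical choices made for illustration; the abstract does not state how the reported accuracy was computed.

```python
from typing import Dict, Iterable, Tuple


def best_hits(lines: Iterable[str],
              max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float]]:
    """Keep the highest bit-score hit per query from tabular hit lines.

    Assumed columns (12, tab-separated): query, subject, %identity,
    alignment length, mismatches, gap opens, q.start, q.end,
    s.start, s.end, e-value, bit score.
    The E-value cutoff is an illustrative assumption, not a value
    taken from the paper.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        query, subject = fields[0], fields[1]
        evalue, bits = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit too weak to transfer an annotation
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return best


def agreement(reannotated: Dict[str, str], original: Dict[str, str]) -> float:
    """Fraction of shared genes whose labels match exactly (assumed metric)."""
    shared = [gene for gene in reannotated if gene in original]
    if not shared:
        return 0.0
    same = sum(1 for gene in shared if reannotated[gene] == original[gene])
    return same / len(shared)


if __name__ == "__main__":
    # Made-up hit lines for demonstration only.
    demo = [
        "g1\tP1\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t150.0",
        "g1\tP2\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t180.0",
        "g2\tP3\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20.0",
    ]
    print(best_hits(demo))
    print(agreement({"g1": "kinase", "g2": "porin"},
                    {"g1": "kinase", "g2": "transporter"}))
```

In practice, comparing free-text product names by exact string match would be too strict; a real comparison would more likely match on database identifiers or normalized names.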
