Tamaki Satoshi, Arakawa Kazuharu, Kono Nobuaki, Tomita Masaru
Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan.
Genomics Proteomics Bioinformatics. 2007 Feb;5(1):53-8. doi: 10.1016/S1672-0229(07)60014-X.
Annotations of complete genome sequences submitted directly from sequencing projects are diverse in terms of annotation strategies and update frequencies. These inconsistencies make comparative studies difficult. To allow rapid data preparation of a large number of complete genomes, automation and speed are important for genome re-annotation. Here we introduce an open-source rapid genome re-annotation software system, Restauro-G, specialized for bacterial genomes. Restauro-G re-annotates a genome by similarity searches utilizing the BLAST-Like Alignment Tool (BLAT), referring to protein databases such as UniProtKB, NCBI nr, NCBI COGs, Pfam, and PSORTb. Re-annotation by Restauro-G achieved over 98% accuracy for most bacterial chromosomes in comparison with the original manually curated annotation of EMBL releases. Restauro-G was developed in the generic bioinformatics workbench G-language Genome Analysis Environment and is distributed at http://restauro-g.iab.keio.ac.jp/ under the GNU General Public License.
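The re-annotation idea described in the abstract, transferring a product name from the best similarity hit and then comparing the result with a reference annotation, can be illustrated with a minimal Python sketch. This is not the Restauro-G implementation (which is built on the G-language Genome Analysis Environment): the `Hit` record, the `min_identity` cutoff, the exact-match agreement measure, and the toy gene names are all assumptions made for illustration, and the hits are assumed to have been parsed already from the output of a similarity search.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """One similarity-search hit (hypothetical record layout)."""
    gene: str        # identifier of the query gene
    product: str     # product name of the matched database entry
    identity: float  # percent identity of the alignment (0-100)
    score: float     # alignment score, higher is better


def transfer_annotation(hits: Iterable[Hit],
                        min_identity: float = 50.0) -> Dict[str, str]:
    """Give each gene the product of its highest-scoring hit above the cutoff."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.identity < min_identity:
            continue  # too dissimilar to trust for annotation transfer
        current = best.get(hit.gene)
        if current is None or hit.score > current.score:
            best[hit.gene] = hit
    return {gene: hit.product for gene, hit in best.items()}


def agreement(predicted: Dict[str, str], reference: Dict[str, str]) -> float:
    """Fraction of reference genes whose predicted product matches exactly."""
    if not reference:
        return 0.0
    matches = sum(1 for gene, product in reference.items()
                  if predicted.get(gene) == product)
    return matches / len(reference)


# Toy data, invented for this example only.
hits = [
    Hit("geneA", "DNA polymerase III", 98.0, 900.0),
    Hit("geneA", "hypothetical protein", 60.0, 120.0),
    Hit("geneB", "ATP synthase subunit alpha", 99.0, 1500.0),
    Hit("geneC", "transposase", 30.0, 30.0),  # below the identity cutoff
]
reference = {
    "geneA": "DNA polymerase III",
    "geneB": "ATP synthase subunit alpha",
    "geneC": "transposase",
}

predicted = transfer_annotation(hits)
print(predicted)
print(f"agreement: {agreement(predicted, reference):.2f}")
```

In this toy run, geneC receives no annotation because its only hit falls below the assumed cutoff, so two of the three reference genes agree. The abstract does not specify how its accuracy figure was computed, so the exact-match comparison here is only one plausible reading.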