Choi Kwangmin, Yang Youngik, Kim Sun
School of Informatics, Indiana University, USA.
Methods Mol Biol. 2007;395:133-46.
Recent advances in genome sequencing technology and algorithms have made it possible to determine the sequence of a whole genome quickly and cost-effectively. As a result, more than 200 genomes have been completely sequenced. However, annotating a genome remains a challenging task. One of the most effective ways to annotate a newly sequenced genome is to compare it with well-annotated, closely related genomes using computational tools and databases. Comparing genomes requires a number of computational tools and produces a large amount of output that genome annotators must then analyze. Because of this difficulty, genome projects are mostly carried out at large genome sequencing centers. To reduce the need for expert knowledge of computational tools and databases, we have developed a web-based genome annotation system called CGAS (a comparative genome annotation system; http://platcom.org/CGAS). This chapter describes how to use CGAS and provides the necessary background on the computational tools and resources. As an example, a Bacillus subtilis genome is treated as an unannotated target genome and compared with several reference genomes, including Bacillus halodurans, Oceanobacillus iheyensis HTE831, and Bacillus cereus group genomes (representative strains of Bacillus cereus and Bacillus anthracis).