Vallenet D, Engelen S, Mornico D, Cruveiller S, Fleury L, Lajus A, Rouy Z, Roche D, Salvignol G, Scarpelli C, Médigue C
CEA/DSV/IG/Genoscope-CNRS UMR8030, Laboratoire de Génomique Comparative (LGC), 2 rue Gaston Crémieux, 91057 Evry Cedex, France.
Database (Oxford). 2009;2009:bap021. doi: 10.1093/database/bap021. Epub 2009 Nov 25.
The initial outcome of genome sequencing is the creation of long text strings written in a four-letter alphabet. The role of in silico sequence analysis is to help biologists associate biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community and can be used to identify genomic objects before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement to automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used both as a viewer for browsing updated annotation information of available microbial genomes (more than 440 organisms to date) and in the context of new annotation projects (117 bacterial genomes).
The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improving the quality of microbial genome annotation, especially for genomes initially analyzed by automatic procedures alone.
Database URLs: http://www.genoscope.cns.fr/agc/mage and http://www.genoscope.cns.fr/agc/microcyc.