Brodie Ryan, Smith Alex J, Roper Rachel L, Tcherepanov Vasily, Upton Chris
Biochemistry and Microbiology, University of Victoria, B.C., V8W 3P6, Canada.
BMC Bioinformatics. 2004 Jul 14;5:96. doi: 10.1186/1471-2105-5-96.
As ever-increasing numbers of closely related virus genomes are sequenced, it has become desirable to compare two genomes at a level more detailed than gene content, because two strains of an organism may share the same set of predicted genes yet still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools.
A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes: it can list each predicted gene that contains nucleotide differences, show whether variations fall within promoter regions or coding regions, and indicate whether these changes result in amino acid substitutions. Base-By-Base can either connect to our MySQL database (Virus Orthologous Clusters; VOCs) to retrieve this annotation information or read it from text files.
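The kind of nucleotide-level report described above can be illustrated with a minimal Python sketch. It is not Base-By-Base code: the names `Gene`, `Difference` and `compare_aligned` are hypothetical, and the sketch assumes a pairwise alignment, forward-strand genes whose coordinates are alignment columns, and no gaps inside coding regions. Promoter annotation is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


@dataclass
class Gene:
    """Hypothetical annotation record for a forward-strand coding region."""
    name: str
    start: int  # 0-based alignment column of the first codon base
    end: int    # exclusive end column; (end - start) is a multiple of 3


@dataclass
class Difference:
    position: int        # 0-based alignment column
    ref_base: str
    other_base: str
    gene: Optional[str]  # None when the column lies outside every gene
    effect: str          # "intergenic", "silent", "untranslatable" or "X->Y"


def compare_aligned(ref: str, other: str, genes: List[Gene]) -> List[Difference]:
    """List every column where two aligned sequences differ, with its effect."""
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    report = []
    for pos, (a, b) in enumerate(zip(ref, other)):
        if a == b:
            continue
        gene = next((g for g in genes if g.start <= pos < g.end), None)
        if gene is None:
            report.append(Difference(pos, a, b, None, "intergenic"))
            continue
        # Locate the codon containing this column and translate both versions.
        codon_start = pos - (pos - gene.start) % 3
        aa_ref = CODON_TABLE.get(ref[codon_start:codon_start + 3].upper())
        aa_other = CODON_TABLE.get(other[codon_start:codon_start + 3].upper())
        if aa_ref is None or aa_other is None:
            effect = "untranslatable"  # gap or ambiguous base in the codon
        elif aa_ref == aa_other:
            effect = "silent"
        else:
            effect = f"{aa_ref}->{aa_other}"
        report.append(Difference(pos, a, b, gene.name, effect))
    return report


if __name__ == "__main__":
    # Toy data: one gene (ATG AAA CCC TAA) flanked by non-coding bases.
    reference = "GGGATGAAACCCTAAGG"
    isolate = "GGAATGAAGTCCTAAGG"
    for diff in compare_aligned(reference, isolate, [Gene("toyGene", 3, 15)]):
        print(diff)
```

On the toy data, the three differing columns are classified as one intergenic change, one silent change and one amino acid substitution. A real tool would also need to handle gaps, reverse-strand genes and more than two genomes.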
Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.