Yu G X, Snyder E E, Boyle S M, Crasta O R, Czar M, Mane S P, Purkayastha A, Sobral B, Setubal J C
Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA 24061, USA.
Nucleic Acids Res. 2007;35(12):3953-62. doi: 10.1093/nar/gkm377. Epub 2007 Jun 6.
We present GenVar, a computational pipeline for bacterial genome analysis. The pipeline, based on the program GeneWise, is designed to analyze an annotated genome and automatically identify missed gene calls and sequence variants, such as genes with disrupted reading frames (split genes) and genes with insertions and deletions (indels). For a given genome to be analyzed, GenVar relies on a database containing closely related genomes (such as other species or strains) as well as a few additional reference genomes. GenVar also helps identify gene disruptions that were probably caused by sequencing errors. We illustrate GenVar's capabilities with results from the analysis of four Brucella genomes. Brucella is an important human pathogen and zoonotic agent. The analysis revealed hundreds of missed gene calls, new split genes and indels, several of which are species-specific and hence provide valuable clues to the genomic basis of Brucella pathogenicity and host specificity.