Argimón Silvia, Konganti Kranti, Chen Hao, Alekseyenko Alexander V, Brown Stuart, Caufield Page W
New York University College of Dentistry, Department of Cariology and Comprehensive Care, 345 East 24th St, New York, NY 10010, USA.
Center for Health Informatics and Bioinformatics, New York University School of Medicine, 227 East 30th St, New York, NY 10016, USA.
Infect Genet Evol. 2014 Jan;21:269-78. doi: 10.1016/j.meegid.2013.11.003. Epub 2013 Nov 26.
Comparative genomics is a popular method for identifying microbial virulence determinants, especially now that sequencing large numbers of whole bacterial genomes from pathogenic and non-pathogenic strains has become relatively inexpensive. Bioinformatics pipelines for comparative genomics usually include gene prediction and annotation and can require significant computing power. To circumvent this, we developed a rapid method for genome-scale in silico subtractive hybridization, based on blastn and independent of feature identification and annotation. Whole-genome comparisons by in silico genome subtraction were performed to identify genetic loci specific to Streptococcus mutans strains associated with severe early childhood caries (S-ECC), compared to strains isolated from caries-free (CF) children. The genome similarity of the 20 S. mutans strains included in this study, calculated by Simrank k-mer sharing, ranged from 79.5% to 90.9%, confirming that this is a genetically heterogeneous group of strains. We identified strain-specific genetic elements in 19 strains, with sizes ranging from 200 bp to 39 kb. These elements contained protein-coding regions with functions mostly associated with mobile DNA. We did not, however, identify any genetic loci consistently associated with dental caries, i.e., shared by all the S-ECC strains and absent in the CF strains. Conversely, we did not identify any genetic loci specific to the healthy group. Applying our in silico genome subtraction to previously published genomes from pathogenic and carriage strains of Neisseria meningitidis yielded the same set of genes specific to the pathogenic strains as previously reported, thus validating our method. Our results suggest that S. mutans strains derived from caries-active or caries-free dentitions cannot be differentiated based on the presence or absence of specific genetic elements.
Our in silico genome subtraction method is available as the Microbial Genome Comparison (MGC) tool, with a user-friendly Java graphical interface.
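The core idea of annotation-free genome subtraction can be illustrated with a minimal Python sketch. It is not the MGC implementation, whose internals are not described here. It assumes that blastn has already been run with a query genome against one or more reference genomes using the standard tabular output (-outfmt 6), and that the query is a single sequence. The function names, the 90% identity cutoff, and the 200 bp minimum length are hypothetical choices made for illustration only.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive coordinates on the query genome


def parse_blast_tabular(lines: Iterable[str], min_identity: float = 90.0) -> List[Interval]:
    """Collect query intervals from blastn tabular output (-outfmt 6).

    Default columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    The identity cutoff is an illustrative assumption.
    """
    hits: List[Interval] = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        if pident >= min_identity:
            hits.append((min(qstart, qend), max(qstart, qend)))
    return hits


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(query_length: int, hits: Iterable[Interval], min_length: int = 200) -> List[Interval]:
    """Return query regions not covered by any hit (the 'subtracted' remainder).

    Regions shorter than min_length (a hypothetical threshold) are dropped.
    """
    unique: List[Interval] = []
    cursor = 1  # first query position not yet accounted for
    for start, end in merge_intervals(hits):
        if start - cursor >= min_length:
            unique.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if query_length - cursor + 1 >= min_length:
        unique.append((cursor, query_length))
    return unique
```

To look for loci specific to one group of strains in this spirit, the hits from all genomes of the other group would be pooled before calling `subtract`, so that only regions absent from every comparison genome remain.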