Department of Microbiology, University of Manitoba, Winnipeg, Manitoba, Canada.
Syst Appl Microbiol. 2011 May;34(3):171-9. doi: 10.1016/j.syapm.2010.11.019. Epub 2011 Mar 9.
D.R. Zeigler showed that the sequence identity of a pair of bacterial genomes can be predicted accurately from the sequence identities of a corresponding set of genes that meet certain criteria [32]. This three-gene model requires determining the sequence identities of recN, thdF, and rpoA, which involves generating approximately 4.2 kb of genomic DNA sequence from each organism to be compared and normally requires designing oligonucleotide primers for amplification and sequencing based on the sequences of closely related organisms. We have developed an analogous mathematical model that predicts the sequence identity of whole genomes from the sequence identity of the 542-567 base pair chaperonin-60 universal target (cpn60 UT). The cpn60 UT is accessible in nearly all bacterial genomes with a single set of universal primers, and it is short enough to be sequenced completely with one pair of overlapping dideoxy sequencing reads. Both models were applied to a set of Thermoanaerobacter isolates from a wood chip compost pile, and both the one-gene cpn60 UT-based model and the three-gene model based on recN, rpoA, and thdF predicted that these isolates could be classified as Thermoanaerobacter thermohydrosulfuricus. Furthermore, the cpn60 UT-based genomic prediction model gave results similar to whole-genome sequence alignments over a broad range of taxa, suggesting that this method may have general utility for screening isolates and predicting their taxonomic affiliations.
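The abstract does not state the form or the coefficients of the cpn60 UT model, so the following is only a minimal sketch of the general idea under stated assumptions: marker identity is computed as percent identity over the ungapped columns of an existing pairwise alignment, and the marker-to-genome relationship is assumed to be a simple least-squares line fitted to calibration pairs supplied by the user. The function names `percent_identity`, `fit_line`, and `predict_genome_identity` are hypothetical, and no published coefficients are reproduced here.

```python
from typing import Sequence, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must already be aligned to the same length ('-' marks a gap).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept.

    ASSUMPTION: a linear form is used purely for illustration; the
    published model may differ.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_genome_identity(marker_identity: float, slope: float, intercept: float) -> float:
    """Apply the fitted line and clamp the prediction to the 0-100 % range."""
    return min(100.0, max(0.0, slope * marker_identity + intercept))
```

In use, `fit_line` would be calibrated on genome pairs for which both the marker identity and the whole-genome identity are known, and `predict_genome_identity` would then be applied to the marker identity measured for a new pair of isolates.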