Feten Guri, Almøy Trygve, Snipen Lars, Aakra Agot, Nyquist O Ludvig, Aastveit Are H
Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, P.O. Box 5003, N-1432 Ås, Norway.
Biom J. 2007 Apr;49(2):242-58. doi: 10.1002/bimj.200510286.
Comparative genomic hybridization (CGH) using microarrays is performed on bacteria to test for genomic diversity within various bacterial species. The microarrays used for CGH are based on the genome of a fully sequenced bacterial strain, denoted the reference strain. Labelled DNA fragments from a sample strain of interest and from the reference strain are hybridized to the array. Based on the obtained ratio intensities and the total intensities of the signals, each gene is classified as either present (one copy or multiple copies) or divergent (zero copies). In this paper, mixture models with different numbers of components are fitted to different combinations of variables and compared with each other. The study shows that mixture models fitted to both the ratio intensities and the total intensities, including the replicates for each gene, improve on previously published methods for several of the data sets tested. Some summaries of the data sets are proposed as a guide for the choice of model and the choice of number of components. The models are applied to data from CGH experiments with the bacteria Staphylococcus aureus and
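The classification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a two-component Gaussian mixture fitted by EM to log2 ratio intensities only, whereas the paper also uses total intensities and replicates and compares different numbers of components. The function names, the simulated data, and the sign convention (divergent genes having strongly negative sample/reference log ratios) are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_mixture(values, n_iter=200):
    """Fit a 1-D two-component Gaussian mixture by EM.

    Returns (weights, means, sds), each a list of length 2.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]                      # start the components at the extremes
    spread = (hi - lo) / 4.0 or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each value
        resp = []
        for x in values:
            dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(dens) or 1e-300   # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and standard deviations
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue                  # leave an empty component unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)  # floor to avoid collapse
    return weights, means, sds


def posterior_present(x, weights, means, sds):
    """Posterior probability that x belongs to the higher-mean component.

    Assumption: the higher-mean component corresponds to "present" genes.
    """
    present = 0 if means[0] > means[1] else 1
    dens = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
    total = sum(dens)
    return dens[present] / total if total > 0 else 0.5


# Simulated log2 ratios (hypothetical data, not from the paper):
# 40 "divergent" genes around -3 and 160 "present" genes around 0.
random.seed(0)
log_ratios = [random.gauss(-3.0, 0.5) for _ in range(40)]
log_ratios += [random.gauss(0.0, 0.3) for _ in range(160)]

weights, means, sds = fit_two_component_mixture(log_ratios)
calls = ["present" if posterior_present(x, weights, means, sds) >= 0.5 else "divergent"
         for x in log_ratios]
```

Each gene is then called "present" or "divergent" from its posterior probability. Extending the sketch toward the setting described in the abstract would mean adding total intensity as a second variable, modelling replicates, and comparing fits with different numbers of components.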