Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland, United States of America.
PLoS One. 2009 Nov 24;4(11):e7978. doi: 10.1371/journal.pone.0007978.
Array-based comparative genomic hybridization (aCGH) has gained prevalence as an effective technique for measuring structural variation in the genome. Copy-number variations (CNVs) are a large source of genomic structural variation, but it is not known whether phenotypic differences between intra-species groups, such as divergent human populations or breeds of a domestic animal, can be attributed to CNVs. Several computational methods have been proposed to improve the detection of CNVs from aCGH data, but few population studies have used aCGH data to identify intra-species differences. In this paper we propose a novel method of genome-wide comparison and classification using aCGH data that condenses whole-genome information, aimed at quantifying intra-species variation and discovering shared ancestry. Our strategy comprises smoothing the aCGH data with an appropriate denoising algorithm, extracting features via wavelets, quantifying the information via the wavelet power spectrum, and hierarchically clustering the resulting profiles. To evaluate the classification efficiency of our method, we used simulated data sets. We then applied it to aCGH data from human and bovine individuals and showed that it successfully detects existing intra-species variations, with additional evolutionary implications.
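The four steps named in the abstract (denoise, wavelet features, power spectrum, hierarchical clustering) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract does not name the denoising algorithm, the wavelet family, or the linkage rule, so the running median, the Haar wavelet, the per-scale energy summary, and average linkage on Euclidean distance are all assumptions made here for illustration. Every function name and the simulated log-ratio profiles are likewise hypothetical.

```python
import math
import random
from statistics import median

def median_smooth(signal, window=5):
    """Running-median denoiser (assumed stand-in; the source names no algorithm)."""
    half = window // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def haar_power_spectrum(signal):
    """Energy of the Haar detail coefficients at each scale, finest scale first.

    Assumes len(signal) is a power of two.
    """
    approx = list(signal)
    powers = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        powers.append(sum(d * d for d in detail))
    return powers

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def average_linkage(profiles, n_clusters):
    """Naive agglomerative clustering; returns clusters as lists of row indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                total = sum(euclidean(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def simulate_profile(rng, cnv_segments, length=256, noise=0.1):
    """Toy log2-ratio profile: noisy zero baseline plus shifts on given segments."""
    profile = [rng.gauss(0.0, noise) for _ in range(length)]
    for start, end, shift in cnv_segments:
        for k in range(start, end):
            profile[k] += shift
    return profile

# Two hypothetical groups: one long gain vs. several short gains and losses.
rng = random.Random(0)
group_a = [simulate_profile(rng, [(40, 56, 1.0)]) for _ in range(3)]
group_b = [simulate_profile(rng, [(40, 44, 1.0), (100, 104, -1.0), (200, 204, 1.0)])
           for _ in range(3)]

features = [haar_power_spectrum(median_smooth(p)) for p in group_a + group_b]
clusters = average_linkage(features, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

Summarizing each profile by its energy per wavelet scale discards the genomic position of individual CNVs and keeps only how much variation occurs at each segment length, which is one way to condense whole-genome information into a short vector that can be clustered. A real analysis would more likely use an established wavelet and clustering library than these hand-written loops.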