Diskin Sharon J, Li Mingyao, Hou Cuiping, Yang Shuzhang, Glessner Joseph, Hakonarson Hakon, Bucan Maja, Maris John M, Wang Kai
Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
Nucleic Acids Res. 2008 Nov;36(19):e126. doi: 10.1093/nar/gkn556. Epub 2008 Sep 10.
Whole-genome microarrays with large-insert clones designed to determine DNA copy number often show variation in hybridization intensity that is related to the genomic position of the clones. We found these 'genomic waves' to be present in Illumina and Affymetrix SNP genotyping arrays, confirming that they are not platform-specific. The causes of genomic waves are not well understood, and the waves may prevent accurate inference of copy number variations (CNVs). By measuring DNA concentration for 1444 samples and by genotyping the same sample multiple times with varying DNA quantity, we demonstrated that DNA quantity correlates with the magnitude of waves. We further showed that, among the multiple genomic features considered, wavy signal patterns correlate best with GC content. To measure the magnitude of waves, we proposed a GC-wave factor (GCWF) measure, which is a reliable predictor of DNA quantity (correlation coefficient = 0.994 based on samples with serial dilution). Finally, we developed a computational approach that fits regression models with GC content included as a predictor variable, and we showed that this approach improves the accuracy of CNV detection. With the wide application of whole-genome SNP genotyping techniques, our wave adjustment method will be important for taking full advantage of genotyped samples for CNV analysis.
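The regression idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single linear GC-content predictor fitted by ordinary least squares, and the function name `adjust_for_gc`, the synthetic data, and the effect size of 0.8 are all hypothetical choices made for illustration. The abstract does not define the GCWF formula, so it is not reproduced here.

```python
import numpy as np

def adjust_for_gc(signal, gc_content):
    """Remove a fitted linear GC-content trend from per-marker signal intensities.

    Hypothetical helper: a plain least-squares fit, not the published method.
    """
    signal = np.asarray(signal, dtype=float)
    gc_content = np.asarray(gc_content, dtype=float)
    # Design matrix: intercept column plus GC content as the only predictor.
    design = np.column_stack([np.ones_like(gc_content), gc_content])
    # Ordinary least-squares fit of signal ~ intercept + slope * gc_content.
    coef, *_ = np.linalg.lstsq(design, signal, rcond=None)
    slope = coef[1]
    # Subtract only the GC-dependent part, so the overall signal level is kept.
    adjusted = signal - slope * (gc_content - gc_content.mean())
    return adjusted, slope

# Synthetic illustration only: a GC-driven trend plus random noise.
rng = np.random.default_rng(0)
gc = rng.uniform(0.3, 0.6, size=2000)      # assumed GC fraction around each marker
noise = rng.normal(0.0, 0.05, size=2000)
wavy = 0.8 * (gc - gc.mean()) + noise      # assumed GC effect size of 0.8
adjusted, slope = adjust_for_gc(wavy, gc)
```

After adjustment, the signal is uncorrelated with GC content by construction, and its spread is smaller than that of the unadjusted signal. Real array data would likely need GC content computed over a genomic window around each marker and possibly a more flexible model; both are left out of this sketch.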