Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.
Proc Natl Acad Sci U S A. 2012 Jan 17;109(3):E103-10. doi: 10.1073/pnas.1106233109. Epub 2011 Dec 29.
Genomic copy number variation underlies genetic disorders such as autism, schizophrenia, and congenital heart disease. Copy number variations are commonly detected by array-based comparative genomic hybridization of sample to reference DNAs, but probe and operational variables combine to create correlated system noise that degrades detection of genetic events. To correct for this, we have explored hybridizations in which no genetic signal is expected, namely "self-self" hybridizations (SSH) comparing DNAs from the same genome. We show that SSH capture a variety of correlated system noise that is also present in sample-reference (test) data. Through singular value decomposition of SSH, we are able to determine the principal components (PCs) of this noise. The PCs themselves offer deep insights into the sources of noise and facilitate detection of artifacts. We present evidence that linear and piecewise linear correction of test data with the PCs does not introduce detectable spurious signal, yet improves signal-to-noise metrics, reduces false positives, and facilitates copy number determination.
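The linear correction described in the abstract can be illustrated with a minimal sketch. It assumes the SSH data are arranged as a probes-by-hybridizations matrix of log ratios, takes the leading left singular vectors as the noise PCs, and subtracts the least-squares projection of a test profile onto them. The function names, the synthetic data, and the choice of a single component are all hypothetical and are not taken from the paper. The piecewise linear variant is not shown.

```python
import numpy as np

def noise_components(ssh, k):
    """Return the top-k left singular vectors of an SSH matrix.

    ssh: array of shape (n_probes, n_hybridizations), assumed to hold
    log ratios from self-self hybridizations (no genetic signal expected).
    """
    u, _, _ = np.linalg.svd(ssh, full_matrices=False)
    return u[:, :k]  # orthonormal columns, one per noise component

def linear_correct(test, pcs):
    """Remove from a test profile its projection onto the noise PCs."""
    coeffs = pcs.T @ test  # least-squares weights (columns are orthonormal)
    return test - pcs @ coeffs

# Synthetic illustration (all values below are made up for the sketch).
rng = np.random.default_rng(0)
n_probes, n_ssh = 500, 20

# A smooth, probe-dependent artifact shared by every hybridization.
noise_shape = np.sin(np.linspace(0, 6 * np.pi, n_probes))
ssh = (np.outer(noise_shape, rng.normal(size=n_ssh))
       + 0.05 * rng.normal(size=(n_probes, n_ssh)))

pcs = noise_components(ssh, k=1)

# A test profile: one copy number gain plus the same artifact.
signal = np.zeros(n_probes)
signal[200:230] = 1.0
test = signal + 0.8 * noise_shape + 0.05 * rng.normal(size=n_probes)

corrected = linear_correct(test, pcs)

err_before = np.linalg.norm(test - signal)
err_after = np.linalg.norm(corrected - signal)
print(f"error before: {err_before:.2f}, after: {err_after:.2f}")
```

In this toy setting the corrected profile lies closer to the underlying signal than the raw test profile does. How many components to keep, and whether to center or scale the SSH matrix first, are choices the sketch leaves open.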