Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, USA.
Front Genet. 2011 Apr 25;2:17. doi: 10.3389/fgene.2011.00017. eCollection 2011.
Since the discovery of the ubiquitous contribution of copy number variation to genetic variability, researchers have commonly used metrics such as r² to quantify linkage disequilibrium (LD) between copy number variants (CNVs) and single nucleotide polymorphisms (SNPs). However, these reports have been restricted to SNPs outside copy number variable regions (CNVRs), as current methods have not been adapted to account for SNPs displaying variable copy number. We show that traditional LD metrics inappropriately quantify SNP/CNV covariance when SNPs lie within a CNVR. We derive a new method for measuring LD that solves this issue, and defaults to traditional metrics otherwise. Finally, we present a procedure to estimate CNV-SNP allele frequencies from unphased CNV-SNP genotypes. Our method allows researchers to include all SNPs in SNP/CNV LD measurements, regardless of copy number.
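For orientation, the traditional r² metric referred to above can be sketched as follows. This is only the standard two-locus definition, r² = D² / (p_A(1 - p_A) p_B(1 - p_B)) with D = p_AB - p_A p_B. It assumes a biallelic SNP outside the CNVR and a CNV coded as a two-allele marker (for example, deletion versus reference) with known phased haplotype counts. It is not the new metric derived in the paper, and the function names and example counts are hypothetical.

```python
from typing import Dict, Tuple


def r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Conventional r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: marginal frequencies of allele A and allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # classical LD coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def r_squared_from_counts(counts: Dict[Tuple[str, str], int],
                          snp_allele: str, cnv_allele: str) -> float:
    """Compute r^2 from phased (SNP allele, CNV allele) haplotype counts."""
    total = sum(counts.values())
    p_ab = counts.get((snp_allele, cnv_allele), 0) / total
    p_a = sum(n for (s, _), n in counts.items() if s == snp_allele) / total
    p_b = sum(n for (_, c), n in counts.items() if c == cnv_allele) / total
    return r_squared(p_ab, p_a, p_b)


# Hypothetical phased haplotype counts: SNP alleles A/G, CNV alleles del/ref.
example_counts = {
    ("A", "del"): 30,
    ("A", "ref"): 20,
    ("G", "del"): 10,
    ("G", "ref"): 40,
}
example_r2 = r_squared_from_counts(example_counts, "A", "del")
```

With these hypothetical counts, p_A = 0.5, p_del = 0.4 and p_A,del = 0.3, so D = 0.1 and r² = 0.01 / 0.06 ≈ 0.167. The sketch assumes every haplotype carries exactly one SNP allele. That assumption no longer holds when the SNP lies inside the CNVR, which is the situation the abstract says traditional metrics handle inappropriately.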