Department of Biostatistics, University of Kentucky, Lexington, Kentucky, United States of America.
PLoS One. 2012;7(4):e34262. doi: 10.1371/journal.pone.0034262. Epub 2012 Apr 6.
Recently, structural variation in the genome has been implicated in many complex diseases. Using genome-wide single nucleotide polymorphism (SNP) arrays, researchers are able to investigate the impact not only of SNP variation, but also of copy-number variants (CNVs) on the phenotype. The most common analytic approach involves estimating, at the level of the individual genome, the underlying number of copies present at each location. Once this is completed, tests are performed to determine the association between copy number state and phenotype. An alternative approach is to carry out association testing first, between phenotype and raw intensities from the SNP array at the level of the individual marker, and then aggregate neighboring test results to identify CNVs associated with the phenotype. Here, we explore the strengths and weaknesses of these two approaches using both simulations and real data from a pharmacogenomic study of the chemotherapeutic agent gemcitabine. Our results indicate that pooled marker-level testing is capable of offering a dramatic increase in power (>12-fold) over CNV-level testing, particularly for small CNVs. However, CNV-level testing is superior when CNVs are large and rare; understanding these tradeoffs is an important consideration in conducting association studies of structural variation.
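The marker-level approach described above (test each marker against the phenotype, then pool neighboring results) can be sketched roughly as follows. This is an illustrative toy and not the authors' procedure: the per-marker statistic (a Fisher-transformed Pearson correlation), the fixed-width Stouffer-style window sum, the function names `marker_z`, `pooled_scores`, and `simulate`, and every simulation parameter are assumptions chosen for brevity.

```python
import math
import random


def marker_z(intensity, phenotype):
    """Approximate z-statistic for one marker: Fisher-transformed Pearson
    correlation between raw intensities and phenotype (an assumed test)."""
    n = len(phenotype)
    mx = sum(intensity) / n
    my = sum(phenotype) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(intensity, phenotype))
    sxx = sum((x - mx) ** 2 for x in intensity)
    syy = sum((y - my) ** 2 for y in phenotype)
    r = sxy / math.sqrt(sxx * syy)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    return math.atanh(r) * math.sqrt(n - 3)


def pooled_scores(z, width):
    """Pool neighboring marker-level results: sum z over each window of
    `width` consecutive markers, scaled by sqrt(width) (Stouffer-style)."""
    return [sum(z[i:i + width]) / math.sqrt(width)
            for i in range(len(z) - width + 1)]


def simulate(n_samples=200, n_markers=60, cnv=(25, 30),
             carrier_freq=0.3, seed=1):
    """Hypothetical data: carriers of one CNV have shifted intensities at
    markers cnv[0]..cnv[1]-1 and a shifted phenotype."""
    rng = random.Random(seed)
    carriers = [rng.random() < carrier_freq for _ in range(n_samples)]
    phenotype = [rng.gauss(0, 1) + (1.0 if c else 0.0) for c in carriers]
    intensities = []  # one list of sample intensities per marker
    for j in range(n_markers):
        in_cnv = cnv[0] <= j < cnv[1]
        intensities.append(
            [rng.gauss(0, 1) + (1.0 if (c and in_cnv) else 0.0)
             for c in carriers])
    return intensities, phenotype


intensities, phenotype = simulate()
z = [marker_z(col, phenotype) for col in intensities]  # marker-level tests
pooled = pooled_scores(z, width=5)                      # aggregate neighbors
best = max(range(len(pooled)), key=lambda i: abs(pooled[i]))
print("strongest window starts at marker", best)
```

In this sketch no copy-number state is ever assigned to an individual; the signal comes only from pooling weak per-marker associations across adjacent markers, which is the intuition behind the power gain reported for small CNVs. The window sum also ignores correlation between neighboring markers, so the pooled scores are a ranking device here rather than calibrated test statistics.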