Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington 98195-7720, USA.
Genet Epidemiol. 2012 Jul;36(5):488-98. doi: 10.1002/gepi.21643. Epub 2012 May 24.
Copy number variation (CNV) is increasingly implicated in disease pathogenesis. CNVs are often identified by applying statistical models to data from single nucleotide polymorphism panels. Family information about the samples provides additional information for CNV inference. Two modes of PennCNV (Joint-call and Posterior-call), which are among the most well-developed family-based CNV calling methods, use a "Joint-model" as a main component. This model treats the CNV states of all family members jointly under Mendelian inheritance. Methods based on the Joint-model are used to infer CNV calls for cases and controls in a pedigree, which may then be compared to test for association. Although the benefits of the Joint-model have been shown elsewhere, the equality of call rates in parents and offspring has not been evaluated previously. Unequal call rates can affect downstream analyses in studies that compare CNV rates in cases vs. controls in pedigrees. In this paper, we show that the Joint-model can introduce different CNV call rates among family members in the absence of a true difference. We show analytically that the Joint-model may introduce differential CNV calls because of the asymmetry of the model. We demonstrate these differential call rates using single-marker simulations. We show that call rates from the two modes of PennCNV also differ between parents and offspring in one multimarker simulated dataset and two real datasets. Our results advise caution in using Joint-model calls in CNV association studies with family-based datasets.
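As a rough illustration of what a single-marker simulation of joint trio calling might look like, the following Python sketch calls CNV states for a father, mother and child together under a simplified Mendelian prior. It is not PennCNV's model and not the simulation design used in the paper: the two-state simplification, the Gaussian signal model, every parameter value, and the function names (`child_prior`, `joint_call`, `simulate_call_rates`) are assumptions made for illustration only. All simulated individuals carry no CNV, so any call is a false positive and the true rate is identical across roles.

```python
import itertools
import math
import random

# Toy two-state model (assumption): 0 = normal copy number, 1 = CNV present.
P_PARENT_CNV = 0.01        # assumed prior CNV frequency in each parent
P_DE_NOVO = 0.0001         # assumed de novo CNV probability in the child
MEANS = {0: 0.0, 1: -0.5}  # assumed mean signal intensity per state
SIGMA = 0.2                # assumed standard deviation of the signal


def child_prior(child, father, mother):
    """P(child state | parental states) under a simplified Mendelian rule."""
    carriers = father + mother
    p_cnv = P_DE_NOVO if carriers == 0 else 0.5 if carriers == 1 else 0.75
    return p_cnv if child == 1 else 1.0 - p_cnv


def log_emission(x, state):
    """Log density of a Gaussian signal given the copy-number state."""
    z = (x - MEANS[state]) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def joint_call(signals):
    """Return the (father, mother, child) states with the highest joint posterior."""
    best, best_score = None, -math.inf
    for f, m, c in itertools.product((0, 1), repeat=3):
        prior = 1.0
        for s in (f, m):  # parents are treated as independent founders
            prior *= P_PARENT_CNV if s == 1 else 1.0 - P_PARENT_CNV
        prior *= child_prior(c, f, m)  # the child depends on both parents
        score = math.log(prior) + sum(
            log_emission(x, s) for x, s in zip(signals, (f, m, c))
        )
        if score > best_score:
            best, best_score = (f, m, c), score
    return best


def simulate_call_rates(n_trios=20000, seed=0):
    """Simulate CNV-free trios and return the fraction called CNV per role."""
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(n_trios):
        signals = [rng.gauss(MEANS[0], SIGMA) for _ in range(3)]
        for i, state in enumerate(joint_call(signals)):
            counts[i] += state
    roles = ("father", "mother", "child")
    return {role: counts[i] / n_trios for i, role in enumerate(roles)}


if __name__ == "__main__":
    print(simulate_call_rates())
```

The asymmetry enters through the prior: parents are modeled as independent founders, while the child's state is conditioned on both parents. Comparing the returned per-role rates gives a feel for whether that asymmetry shifts call rates in this toy setting; the size and direction of any difference depend on the assumed parameters and need not match the paper's results.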