MRC Centre for Causal Analyses in Translational Epidemiology and Bristol Genetic Epidemiology Laboratories, Department of Social Medicine, University of Bristol, Oakfield House, Bristol, United Kingdom.
Hum Mutat. 2010 Apr;31(4):414-20. doi: 10.1002/humu.21199.
Copy number variations (CNVs) are a common form of genetic variation in which the allelic population contains a distribution of copy numbers of a particular gene (or other large sequence/region). The simplest forms describe deletion (0 vs. 1 copy) or duplication (1 vs. 2 copies) events. However, some CNV loci contain a much wider range of copy numbers, such as that seen for the CCL3L1 locus. CNV classification methods typically describe only the total (diploid) copy number, leaving the underlying genotypic and allelic frequency distribution unknown. We have developed an expectation-maximization approach for the analysis of data from tandem CNVs that enables estimation of both the allelic copy number frequency distribution and the expected copy number genotype and class distribution under Hardy-Weinberg equilibrium (HWE). The CNV expectation-maximization algorithm is available in a web tool (CoNVEM, http://apps.biocompute.org.uk/convem/), which graphically and numerically presents CNV allele and genotype distributions. We have applied this approach to the analysis of salivary amylase (AMY1A, B, and C), CCL3L1, and SULT1A1 CNVs using published data, and present inferences about the evolutionary history of these loci based on CoNVEM results.
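The general idea of estimating allelic copy number frequencies from diploid totals can be illustrated with a minimal sketch. This is not the CoNVEM implementation, whose details are not given in the abstract; it is a generic expectation-maximization scheme under HWE. The function names, the uniform starting frequencies, the convergence tolerance, and the example counts are all assumptions made for illustration.

```python
def em_cnv_allele_freqs(diploid_counts, max_allele, n_iter=1000, tol=1e-12):
    """Estimate allelic copy number frequencies from diploid copy number counts.

    diploid_counts: dict mapping total (diploid) copy number -> number of individuals.
    max_allele: largest per-chromosome copy number considered (an assumption
                the caller must supply; it is not inferred from the data).
    Returns a list p where p[i] is the estimated frequency of the i-copy allele.
    """
    n_alleles = max_allele + 1
    for k in diploid_counts:
        if k < 0 or k > 2 * max_allele:
            raise ValueError(f"diploid copy number {k} is impossible with max_allele={max_allele}")
    total = sum(diploid_counts.values())
    if total == 0:
        raise ValueError("no individuals observed")

    # Hypothetical starting point: uniform allele frequencies.
    p = [1.0 / n_alleles] * n_alleles

    for _ in range(n_iter):
        expected = [0.0] * n_alleles
        for k, n_k in diploid_counts.items():
            # Ordered allele pairs (i, j) that sum to the observed total k.
            pairs = [(i, k - i) for i in range(n_alleles) if 0 <= k - i < n_alleles]
            # Under HWE, an ordered pair has probability p[i] * p[j].
            weights = [p[i] * p[j] for i, j in pairs]
            z = sum(weights)
            if z == 0.0:
                continue  # class has zero probability under the current estimate
            # E-step: share the n_k individuals across compatible genotypes.
            for (i, j), w in zip(pairs, weights):
                share = n_k * w / z
                expected[i] += share
                expected[j] += share
        # M-step: renormalize the expected allele counts.
        norm = sum(expected)
        new_p = [e / norm for e in expected]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


def expected_class_distribution(p):
    """Expected diploid copy number class frequencies under HWE for allele frequencies p."""
    probs = [0.0] * (2 * len(p) - 1)
    for i, p_i in enumerate(p):
        for j, p_j in enumerate(p):
            probs[i + j] += p_i * p_j
    return probs


if __name__ == "__main__":
    # Made-up counts for illustration only; not data from the paper.
    counts = {0: 4, 1: 20, 2: 37, 3: 30, 4: 9}
    freqs = em_cnv_allele_freqs(counts, max_allele=2)
    print("allele frequencies:", freqs)
    print("expected class distribution:", expected_class_distribution(freqs))
```

For a simple deletion locus (alleles of 0 or 1 copy) every diploid class maps to a single unordered genotype, so the estimate reduces to direct allele counting. The iterative step only matters when a class such as 2 copies can arise from several genotypes (1+1 or 0+2).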