Division of Structural and Functional Genomics, Center for Genome Science, Korea National Institute of Health, Osong, Korea.
Eur J Hum Genet. 2011 Nov;19(11):1167-72. doi: 10.1038/ejhg.2011.103. Epub 2011 Jul 6.
To date, hundreds of thousands of copy-number variations (CNVs) have been reported using various platforms. The proportion of Asians in these data is, however, relatively small compared with that of other ethnic groups, such as Caucasians and Yorubas. Because of limitations in platform resolution and the high noise level in signal intensity, in most CNV studies (particularly those using single nucleotide polymorphism arrays), the average number of CNVs detected in an individual is lower than the number of known CNVs. In this study, we ascertained reliable, common CNV regions (CNVRs) and determined their actual frequencies in the Korean population to provide more CNV information. We performed a two-stage analysis to detect structural variations using two platforms. We discovered 576 common CNVRs (88 CNV segments on average per individual), and 87% (501 of 576) of these CNVRs overlapped by ≥1 bp with previously validated CNV events. Interestingly, in the frequency analysis of CNV profiles, 52 of the 576 CNVRs had a frequency of <1% in the 8842 individuals. Compared with other studies of common CNVs, this study found six common CNVRs that had not been reported previously. In conclusion, we propose a data-driven detection approach for discovering common CNVRs, including those unreported in the previous Korean CNV study, while minimizing false positives. Through this approach, we discovered more common CNVRs than the previous Korean CNV study and conducted a frequency analysis. These results will be a valuable resource on the effective level of CNVs in the Korean population.
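As an illustration of the "overlapped by ≥1 bp" criterion in the abstract, the following is a minimal sketch, not the authors' pipeline. The `Region` type, the function names, and the use of 1-based inclusive coordinates are assumptions made here for the example; the abstract does not specify how overlap was computed.

```python
from typing import NamedTuple, Sequence


class Region(NamedTuple):
    """A genomic interval (hypothetical type; 1-based, inclusive ends assumed)."""
    chrom: str
    start: int
    end: int


def overlaps_by_at_least_1bp(a: Region, b: Region) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def fraction_overlapping(cnvrs: Sequence[Region], known: Sequence[Region]) -> float:
    """Fraction of CNVRs that overlap any known CNV event by >= 1 bp."""
    if not cnvrs:
        return 0.0
    hits = sum(
        any(overlaps_by_at_least_1bp(c, k) for k in known) for c in cnvrs
    )
    return hits / len(cnvrs)


# The percentage reported in the abstract follows from the stated counts:
# 501 of 576 CNVRs is about 87%.
reported_percent = round(100 * 501 / 576)
```

The pairwise scan is quadratic in the number of regions, which is acceptable for a few hundred CNVRs; a sorted sweep or an interval tree would be the usual choice for larger sets.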