Kang Tae-Wook, Jeon Yeo-Jin, Jang Eunsu, Kim Hee-Jin, Kim Jeong-Hwan, Park Jong-Lyul, Lee Siwoo, Kim Yong Sung, Kim Jong Yeol, Kim Seon-Young
Medical Genomics Research Center, KRIBB, 52 Eoeun-dong, Yuseong-gu, Daejeon 305-806, Republic of Korea.
BMC Genomics. 2008 Oct 18;9:492. doi: 10.1186/1471-2164-9-492.
Copy number variations (CNVs) are deletions, insertions, duplications, and more complex variations ranging from 1 kb to sub-microscopic sizes. Recent advances in array technologies have enabled researchers to identify a number of CNVs from normal individuals. However, the identification of new CNVs has not yet reached saturation, and more CNVs from diverse populations remain to be discovered.
We identified 65 copy number variation regions (CNVRs) in 116 normal Korean individuals by analyzing Affymetrix 250 K Nsp whole-genome SNP data. Ten of these CNVRs were novel, that is, not present in the Database of Genomic Variants (DGV). To increase the specificity of CNV detection, three algorithms (CNAG, dChip, and GEMCA) were applied to the data set, and only regions recognized by at least two algorithms were accepted as CNVs. Most CNVRs identified in the Korean population were rare (<1%), occurring just once among the 116 individuals. When CNVs from the Korean population were compared with CNVs from the three HapMap ethnic groups (African, European, and Asian), the Korean population showed the highest degree of overlap with the Asian population, as expected. However, the overlap was less than 40%, implying that more CNVs remain to be discovered in the Asian population as well as in other populations. Genes in the novel CNVRs from the Korean population were enriched for those involved in regulation and development processes.
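The "at least two of three algorithms" filter can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how overlapping calls were merged, so the half-open `(chrom, start, end)` interval format, the function names `merge_intervals` and `consensus_regions`, and the coordinates in the usage note are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_intervals(calls):
    """Merge overlapping or abutting (chrom, start, end) calls from one algorithm."""
    merged = []
    for chrom, start, end in sorted(calls):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def consensus_regions(calls_by_algorithm, min_support=2):
    """Return regions covered by at least `min_support` algorithms.

    `calls_by_algorithm` maps an algorithm name to a list of half-open
    (chrom, start, end) calls for one sample (hypothetical input format).
    """
    events = defaultdict(list)
    for calls in calls_by_algorithm.values():
        # Merge first so each algorithm counts at most once per position.
        for chrom, start, end in merge_intervals(calls):
            events[chrom].append((start, +1))
            events[chrom].append((end, -1))

    regions = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # At equal positions, process starts before ends so that abutting
        # consensus segments are reported as one region.
        for pos, delta in sorted(events[chrom], key=lambda e: (e[0], -e[1])):
            previous = depth
            depth += delta
            if previous < min_support <= depth:
                region_start = pos
            elif depth < min_support <= previous:
                regions.append((chrom, region_start, pos))
    return regions
```

With made-up calls such as `{"CNAG": [("chr1", 100, 200)], "dChip": [("chr1", 150, 250)], "GEMCA": [("chr1", 180, 300)]}`, the sketch keeps only the stretch of chr1 where two or more algorithms agree and discards calls made by a single algorithm.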
CNVs are a recently recognized form of structural variation among individuals, and more CNVs need to be identified from diverse populations. Until now, CNVs from Asian populations have been studied less than those from European or American populations. In this regard, our study of CNVs from the Korean population will contribute to the full cataloguing of structural variation among diverse human populations.