Department of Biological Sciences, National University of Singapore, Science Drive 4, Singapore.
BMC Genet. 2010 May 7;11:36. doi: 10.1186/1471-2156-11-36.
The International Hapmap project serves as a valuable resource for human genome variation data; however, its applicability to other populations has yet to be exhaustively investigated. In this paper, we use high-density genotyping chips and resequencing strategies to compare the Singapore Chinese population with the Hapmap populations. First, we compared 1028 and 114 unrelated Singapore Chinese samples, genotyped on the Illumina Human Hapmap 550 k chip and the Affymetrix 500 k array respectively, against the 270 samples from Hapmap. Second, we analysed data from 20 candidate genes on 5q31-33 that had been resequenced for an asthma candidate-gene study.
A total of 237 SNPs were identified through resequencing, of which only 95 (40%) were in Hapmap; an additional 56 SNPs (24%) were not genotyped directly but had a proxy SNP in Hapmap. At the genome-wide level, Singapore Chinese were highly correlated with Hapmap Han Chinese, with correlations of 0.954 and 0.947 for the Illumina and Affymetrix platforms respectively, and deviant SNPs were randomly distributed within and across all chromosomes.
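As a rough illustration of the two kinds of figures reported here, the sketch below computes coverage fractions from SNP counts and a correlation between two populations' per-SNP values. It rests on assumptions: the abstract does not say which statistic was correlated or how, so Pearson's r over per-SNP allele frequencies is assumed, and the function names `coverage` and `pearson_r` are hypothetical and not taken from the study.

```python
from math import sqrt


def coverage(total, direct, proxy):
    """Return (direct, proxy, combined) fractions of `total` SNPs.

    `direct` = SNPs present in the reference panel,
    `proxy`  = SNPs absent from the panel but tagged by a proxy SNP.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return direct / total, proxy / total, (direct + proxy) / total


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumption: x and y hold per-SNP allele frequencies for the same
    SNPs, in the same order, in two populations.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# SNP counts reported in the abstract: 237 found, 95 in Hapmap, 56 with a proxy.
direct_frac, proxy_frac, combined_frac = coverage(237, 95, 56)
```

With the counts from the abstract, `direct_frac` and `proxy_frac` round to the reported 40% and 24%, so roughly 64% of the resequenced SNPs are covered either directly or through a proxy.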
The high correlation between our population and Hapmap Han Chinese reaffirms the applicability of Hapmap-based genome-wide chips for GWA studies. The Singapore Chinese samples show a clear population signature and predominantly resemble the southern Han Chinese population; however, when new migrants, particularly those with a northern Han Chinese background, are included, population stratification issues may arise. Future studies need to address population stratification within the sample collection when designing and interpreting GWAS in the Chinese population.