Nishida Nao, Koike Asako, Tajima Atsushi, Ogasawara Yuko, Ishibashi Yoshimi, Uehara Yasuka, Inoue Ituro, Tokunaga Katsushi
Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
BMC Genomics. 2008 Sep 22;9:431. doi: 10.1186/1471-2164-9-431.
With improvements in genotyping technologies, genome-wide association studies with hundreds of thousands of SNPs allow the identification of candidate genetic loci for multifactorial diseases in different populations. However, genotyping errors caused by genotyping platforms or genotype calling algorithms may lead to inflation of false associations between markers and phenotypes. In addition, the number of SNPs available for genome-wide association studies in the Japanese population has been investigated using only 45 samples in the HapMap project, which could lead to an inaccurate estimation of the number of SNPs with low minor allele frequencies. We genotyped 400 Japanese samples in order to estimate the number of SNPs available for genome-wide association studies in the Japanese population and to examine the performance of the current SNP Array 6.0 platform and the genotype calling algorithm "Birdseed".
About 20% of the 909,622 SNP markers on the array were monomorphic in the Japanese population. Consequently, after excluding the poorly behaving SNPs, 661,599 SNPs were available for genome-wide association studies in the Japanese population. After low-quality samples were removed by adjusting the QC criteria, the Birdseed algorithm called genotypes accurately, achieving an overall call rate above 99.5% and a concordance rate above 99.8% when more than 48 samples were used.
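The quality metrics named above (call rate, concordance rate, and monomorphic markers) can be illustrated with a minimal Python sketch. It is not the authors' pipeline and not part of Birdseed. The function names, the use of `None` for a no-call, and the two-letter genotype strings such as "AG" are all assumptions made for illustration, and the definitions used are the common textbook ones rather than those confirmed by the source.

```python
from collections import Counter

NO_CALL = None  # assumption: a genotype that could not be called is stored as None


def call_rate(calls):
    """Fraction of genotypes that received a call."""
    if not calls:
        raise ValueError("no genotypes supplied")
    called = sum(1 for g in calls if g is not NO_CALL)
    return called / len(calls)


def concordance_rate(calls_a, calls_b):
    """Fraction of identical calls among positions called in both runs."""
    pairs = [
        (a, b)
        for a, b in zip(calls_a, calls_b)
        if a is not NO_CALL and b is not NO_CALL
    ]
    if not pairs:
        raise ValueError("no positions were called in both runs")
    return sum(1 for a, b in pairs if a == b) / len(pairs)


def minor_allele_frequency(calls):
    """Frequency of the second most common allele (0.0 if only one allele is seen)."""
    alleles = Counter()
    for g in calls:
        if g is not NO_CALL:
            alleles.update(g)  # "AG" contributes one A and one G
    ranked = alleles.most_common()
    if len(ranked) < 2:
        return 0.0
    return ranked[1][1] / sum(alleles.values())


def is_monomorphic(calls):
    """A marker is monomorphic in the sample set if only one allele is observed."""
    return minor_allele_frequency(calls) == 0.0
```

Under these assumptions, a marker with calls `["AA", "AG", None, "GG"]` has a call rate of 0.75 and is not monomorphic, whereas a marker with calls `["AA", "AA"]` is.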
Our results confirmed that the SNP Array 6.0 platform reached the level reported by the manufacturer, and thus genome-wide association studies using the SNP Array 6.0 platform have considerable potential to identify candidate susceptibility or resistance genetic factors for multifactorial diseases in the Japanese population, as well as in other populations.