Hinds David A, Stokowski Renee P, Patil Nila, Konvicka Karel, Kershenobich David, Cox David R, Ballinger Dennis G
Perlegen Sciences, Mountain View, CA, 94043, USA.
Am J Hum Genet. 2004 Feb;74(2):317-25. doi: 10.1086/381716. Epub 2004 Jan 21.
Association studies in populations that are genetically heterogeneous can yield large numbers of spurious associations if population subgroups are unequally represented among cases and controls. This problem is particularly acute for studies involving pooled genotyping of very large numbers of single-nucleotide-polymorphism (SNP) markers, because most methods for analysis of association in structured populations require individual genotyping data. In this study, we present several strategies for matching case and control pools to have similar genetic compositions, based on ancestry information inferred from genotype data for approximately 300 SNPs tiled on an oligonucleotide-based genotyping array. We also discuss methods for measuring the impact of population stratification on an association study. Results for an admixed population and a phenotype strongly confounded with ancestry show that these simple matching strategies can effectively mitigate the impact of population stratification.
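As a rough illustration of what matching case and control pools on inferred ancestry could look like, here is a minimal sketch. It is not the authors' procedure, since the abstract does not specify their matching strategies. It assumes each individual already has a single scalar ancestry estimate between 0 and 1 (for example, an admixture proportion inferred from the SNP panel). The function names `match_pools` and `pool_mean` and the `max_diff` tolerance are hypothetical, introduced only for this example.

```python
from statistics import mean


def match_pools(case_ancestry, control_ancestry, max_diff=0.05):
    """Greedily pair each case with the closest unused control.

    Both arguments map an individual ID to an assumed scalar ancestry
    estimate. A case is left out of the pool if no unused control lies
    within max_diff (a hypothetical tolerance) of its ancestry value.
    """
    used = set()
    pairs = []
    # Process cases in order of ancestry so the result is deterministic.
    for case_id, case_val in sorted(case_ancestry.items(), key=lambda kv: kv[1]):
        best_id, best_diff = None, None
        for ctrl_id, ctrl_val in control_ancestry.items():
            if ctrl_id in used:
                continue
            diff = abs(case_val - ctrl_val)
            if diff <= max_diff and (best_diff is None or diff < best_diff):
                best_id, best_diff = ctrl_id, diff
        if best_id is not None:
            used.add(best_id)
            pairs.append((case_id, best_id))
    return pairs


def pool_mean(ancestry, ids):
    """Mean ancestry estimate of the individuals selected for a pool."""
    return mean(ancestry[i] for i in ids)
```

Comparing `pool_mean` for the matched cases and the matched controls gives a quick check that the two pools have similar average ancestry. Greedy pairing is only one simple option; it trades optimality for clarity and may discard individuals that a more careful matching would retain.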