Huang Qiqing, Fu Yun-Xin, Boerwinkle Eric
Human Genetics Center, University of Texas-Houston Health Science Center, 1200 Herman Pressler, Suite 453E, Houston, TX 77030, USA.
Hum Genet. 2003 Aug;113(3):253-7. doi: 10.1007/s00439-003-0965-x. Epub 2003 Jun 17.
It is widely believed that a subset of single nucleotide polymorphisms (SNPs) is able to capture the majority of the information for genotype-phenotype association studies that is contained in the complete complement of genetic variations. The question remains: how does one select that particular subset of SNPs in order to maximize the power of detecting a significant association? In this study, we have used a simulation approach to compare three competing methods of site selection: random selection, selection based on pair-wise linkage disequilibrium, and selection based on maximizing haplotype diversity. The results indicate that site selection based on maximizing haplotype diversity is preferred over random selection and selection based on pair-wise linkage disequilibrium. The results also indicate that it is more prudent to increase the sample size to improve a study's power than to continuously increase the number of SNPs. These results have direct implications for designing gene-based and genome-wide association studies.
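To make the idea of selecting sites by maximizing haplotype diversity concrete, the following is a minimal sketch and not the authors' procedure, which the abstract does not specify. It assumes haplotype diversity is measured as one minus the sum of squared haplotype frequencies, and it assumes a greedy forward search; the function names `haplotype_diversity` and `greedy_select` and the toy data are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes, sites):
    """Diversity of the haplotypes restricted to `sites`.

    Assumed measure: 1 - sum(p_i ** 2), where p_i is the sample frequency
    of the i-th distinct sub-haplotype.
    """
    n = len(haplotypes)
    counts = Counter(tuple(h[s] for s in sites) for h in haplotypes)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def greedy_select(haplotypes, k):
    """Pick up to k sites, each time adding the site that raises diversity most.

    Assumption: a greedy forward search, used here only for illustration.
    Ties are broken in favor of the lowest site index.
    """
    n_sites = len(haplotypes[0])
    chosen = []
    for _ in range(min(k, n_sites)):
        remaining = [s for s in range(n_sites) if s not in chosen]
        best = max(
            remaining,
            key=lambda s: haplotype_diversity(haplotypes, chosen + [s]),
        )
        chosen.append(best)
    return chosen


# Toy data (hypothetical): sites 0 and 1 are in complete linkage
# disequilibrium, as are sites 2 and 3.
haps = ["0000", "0011", "1100", "1111"]
selected = greedy_select(haps, 2)
print(selected)                             # [0, 2]
print(haplotype_diversity(haps, selected))  # 0.75
```

In this toy example the search skips site 1 because it is redundant with site 0 and adds no diversity, while site 2 splits the sample further. That is the intuition behind preferring a diversity criterion over picking sites at random.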