Ohashi J, Tokunaga K
Department of Human Genetics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
Ann Hum Genet. 2002 Jul;66(Pt 4):297-306. doi: 10.1017/S0003480002001197.
The expected power of genome-wide linkage disequilibrium (LD) testing for a low-frequency disease variant was examined using a simple genetic model in which the degree of LD between the disease variant and the adjacent single nucleotide polymorphism (SNP) marker decreases in proportion to the number of generations since the LD-generating event. In this study, the frequency of the SNP marker allele in complete LD with a low-frequency disease variant at the LD-generating event was treated as a random variable following the probability distribution expected under the neutral infinite sites model. This allows a formula for the expected power of genome-wide LD testing to be derived without specifying the allele frequency of the associated SNP marker. Such a treatment is essential for evaluating the power of LD testing, because the frequency of the associated marker allele is always unknown in advance. The main results are as follows: (1) genome-wide LD testing with a case-control design could identify a disease variant with high penetrance, whereas a low-frequency disease variant with low penetrance is difficult to detect; (2) although the degree of LD increases as the number of markers increases, the power of LD testing does not necessarily increase once the significance level is adjusted for the number of tests by the Šidák correction or the Bonferroni correction; (3) using only SNP markers with high-frequency minor alleles is more powerful for detecting LD, even with a low-frequency disease variant, than using SNP markers with both high- and low-frequency minor alleles. Thus, the study design of LD testing must be evaluated prior to the investigation. The present study provides a guideline for determining the number of SNP markers and the range of SNP allele frequencies suitable for genome-wide LD testing.
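Result (2) turns on how quickly the per-test significance level shrinks as markers are added. The following is a minimal sketch of the two standard corrections named in the abstract, not the authors' power formula. The function names, the family-wise level of 0.05, and the marker counts are illustrative assumptions and do not come from the paper.

```python
import math


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction: alpha / m."""
    return alpha / n_tests


def sidak_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Sidak correction.

    Computes 1 - (1 - alpha)**(1/m) via log1p/expm1 so that the result
    stays accurate when m is very large.
    """
    return -math.expm1(math.log1p(-alpha) / n_tests)


if __name__ == "__main__":
    family_alpha = 0.05  # assumed family-wise level, for illustration only
    for n_markers in (1_000, 100_000, 1_000_000):  # assumed marker counts
        b = bonferroni_alpha(family_alpha, n_markers)
        s = sidak_alpha(family_alpha, n_markers)
        print(f"{n_markers:>9} markers: Bonferroni {b:.3e}, Sidak {s:.3e}")
```

Both thresholds fall roughly in inverse proportion to the number of markers, with the Šidák value always slightly larger than the Bonferroni value. This is the penalty that can offset the gain in LD from a denser marker map.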