Daetwyler Hans D, Villanueva Beatriz, Woolliams John A
Genetics and Genomics, The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Roslin, Midlothian, United Kingdom.
PLoS One. 2008;3(10):e3395. doi: 10.1371/journal.pone.0003395. Epub 2008 Oct 14.
BACKGROUND: The prediction of the genetic disease risk of an individual is a powerful public health tool. While predicting risk has been successful in diseases that follow simple Mendelian inheritance, it has proven challenging in complex diseases for which a large number of loci contribute to the genetic variance. The large numbers of single nucleotide polymorphisms now available provide new opportunities for predicting genetic risk of complex diseases with high accuracy.
METHODOLOGY/PRINCIPAL FINDINGS: We have derived simple deterministic formulae to predict the accuracy of predicted genetic risk from population or case control studies using a genome-wide approach and assuming a dichotomous disease phenotype with an underlying continuous liability. We show that the prediction equations are special cases of the more general problem of predicting the accuracy of estimates of genetic values of a continuous phenotype. Our predictive equations are responsive to all parameters that affect accuracy and they are independent of allele frequency and effect distributions. Deterministic prediction errors when tested by simulation were generally small. The common link among the expressions for accuracy is that they are best summarized by a single quantity: the observed heritability multiplied by the ratio of the number of phenotypic records to the number of risk loci.
CONCLUSIONS/SIGNIFICANCE: This study advances the understanding of the relative power of case control and population studies of disease. The predictions represent an upper bound of accuracy which may be achievable with improved effect estimation methods. The formulae derived will help researchers determine an appropriate sample size to attain a certain accuracy when predicting genetic risk.
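As a rough illustration of how such formulae might be used for sample size planning, the sketch below works with the summary quantity named in the abstract: the observed heritability multiplied by the ratio of phenotypic records to risk loci. The abstract does not state the functional form, so the expression r = sqrt(x / (x + 1)) used here is an assumption (the form commonly cited for the continuous-phenotype case), and the function names `expected_accuracy` and `records_needed` are hypothetical. The case control expressions in the paper involve further parameters that are not modelled here.

```python
import math


def expected_accuracy(n_records: float, n_loci: float, h2: float) -> float:
    """Approximate accuracy of predicted genetic values.

    ASSUMPTION: uses r = sqrt(x / (x + 1)), where
    x = (n_records / n_loci) * h2. This form is not given in the
    abstract and is assumed here for the continuous-phenotype case.
    """
    if n_records <= 0 or n_loci <= 0 or not 0 < h2 <= 1:
        raise ValueError("need n_records > 0, n_loci > 0 and 0 < h2 <= 1")
    x = (n_records / n_loci) * h2
    return math.sqrt(x / (x + 1.0))


def records_needed(target_accuracy: float, n_loci: float, h2: float) -> int:
    """Invert the assumed formula to get the number of phenotypic records.

    Solving r^2 = x / (x + 1) for x gives x = r^2 / (1 - r^2), so
    n_records = r^2 * n_loci / (h2 * (1 - r^2)), rounded up.
    """
    if not 0 < target_accuracy < 1:
        raise ValueError("target_accuracy must lie strictly between 0 and 1")
    if n_loci <= 0 or not 0 < h2 <= 1:
        raise ValueError("need n_loci > 0 and 0 < h2 <= 1")
    r2 = target_accuracy ** 2
    return math.ceil(r2 * n_loci / (h2 * (1.0 - r2)))


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the functions.
    print(expected_accuracy(n_records=5000, n_loci=1000, h2=0.3))
    print(records_needed(target_accuracy=0.7, n_loci=1000, h2=0.3))
```

Under this assumed form, accuracy depends on the number of records, the number of loci and the heritability only through their combined product, so halving the heritability has the same effect as halving the number of records per locus.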