Department of Epidemiology, Erasmus University Medical Center, Rotterdam, The Netherlands.
Eur J Hum Genet. 2012 Dec;20(12):1270-4. doi: 10.1038/ejhg.2012.89. Epub 2012 May 30.
Various modeling methods have been proposed to estimate the potential predictive ability of polygenic risk variants that predispose to common diseases. However, it is unknown whether differences between these methods affect their conclusions on predictive ability. We reviewed the input parameters, assumptions and output of the five most common methods and compared their estimates of the area under the receiver operating characteristic (ROC) curve (AUC) using hypothetical data representing effect sizes and frequencies of genetic variants, population disease risk and number of variants. To assess the accuracy of the estimated AUCs, we aimed to reproduce the AUCs of published empirical studies. All methods assumed that the combined effect of genetic variants on disease risk followed a multiplicative risk model of independent genetic effects, but they assumed either per-allele, per-genotype or dominant/recessive effects for the genetic variants. Modeling strategy and input parameters differed. Methods used simulation analysis or analytical formulas, with effect sizes quantified by odds ratios (ORs) or relative risks. Estimated AUC values were similar for lower ORs (<1.2). When AUCs were larger (>0.7) due to variants with strong effects, differences in estimated AUCs between methods increased. The simulation methods accurately reproduced the AUC values of empirical studies, but the analytical methods did not. We conclude that, despite differences in input parameters, the modeling methods yield similar AUC estimates for realistic values of the ORs. When one or more variants have stronger effects and AUC values are higher, the simulation methods tend to be more accurate.
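To make the simulation approach described in the abstract concrete, the following is a minimal sketch of how an AUC might be estimated from hypothetical effect sizes, allele frequencies and population disease risk. It is not a reproduction of any of the five reviewed methods. It assumes per-allele effects that combine multiplicatively on the odds scale (a logistic model), independent variants, and genotypes in Hardy-Weinberg proportions. The function names (`average_ranks`, `auc_from_scores`, `simulate_auc`) and all input values are illustrative assumptions.

```python
import numpy as np


def average_ranks(values):
    """Return ranks 1..n, with tied values sharing their mean rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)        # highest rank within each tie group
    first = last - counts + 1       # lowest rank within each tie group
    return ((first + last) / 2.0)[inverse]


def auc_from_scores(scores, status):
    """AUC via the Mann-Whitney U statistic (ties count as one half)."""
    status = np.asarray(status)
    ranks = average_ranks(np.asarray(scores, dtype=float))
    n_cases = int((status == 1).sum())
    n_controls = status.size - n_cases
    u = ranks[status == 1].sum() - n_cases * (n_cases + 1) / 2.0
    return u / (n_cases * n_controls)


def simulate_auc(odds_ratios, allele_freqs, disease_risk, n_people=100_000, seed=0):
    """Simulate a population and return the AUC of the true genetic risk.

    Assumptions (ours, for illustration): per-allele ORs, independent
    variants, Hardy-Weinberg genotype proportions, logistic risk model.
    """
    rng = np.random.default_rng(seed)
    odds_ratios = np.asarray(odds_ratios, dtype=float)
    allele_freqs = np.asarray(allele_freqs, dtype=float)

    # Number of risk alleles (0, 1 or 2) per person and variant.
    genotypes = rng.binomial(2, allele_freqs, size=(n_people, allele_freqs.size))

    # Multiplicative effects on the odds scale are additive on the log-odds scale.
    score = genotypes @ np.log(odds_ratios)

    # Bisection on the intercept so that mean risk matches the population risk.
    low, high = -30.0, 30.0
    for _ in range(100):
        intercept = (low + high) / 2.0
        mean_risk = np.mean(1.0 / (1.0 + np.exp(-(intercept + score))))
        if mean_risk < disease_risk:
            low = intercept
        else:
            high = intercept

    risk = 1.0 / (1.0 + np.exp(-(intercept + score)))
    status = rng.random(n_people) < risk   # True marks a simulated case
    return auc_from_scores(risk, status)


if __name__ == "__main__":
    # Hypothetical input: ten variants, risk allele frequency 0.3, disease risk 10%.
    for odds_ratio in (1.1, 1.5, 2.0):
        auc = simulate_auc([odds_ratio] * 10, [0.3] * 10, disease_risk=0.10)
        print(f"OR = {odds_ratio}: simulated AUC = {auc:.3f}")
```

Because the score takes only discrete values when few variants are simulated, the AUC is computed with mean ranks so that ties between a case and a control contribute one half. An analytical formula would replace the simulated population with a closed-form approximation of the score distribution, which is where the abstract reports methods diverging once variants have strong effects.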