Department of Clinical Genetics, Section Community Genetics, Amsterdam Public Health Research Institute, VU University Medical Center, Amsterdam, The Netherlands.
Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA.
Genet Med. 2019 Feb;21(2):391-397. doi: 10.1038/s41436-018-0058-9. Epub 2018 Jun 12.
The area under the receiver operating characteristic curve (AUC) is commonly used for evaluating the improvement of polygenic risk models and is increasingly assessed together with the net reclassification improvement (NRI) and integrated discrimination improvement (IDI). We evaluated how researchers described and interpreted AUC, NRI, and IDI when all three were assessed in the same study.
We reviewed how researchers defined AUC, NRI, and IDI and how they computed each metric. Next, we reviewed how the increments in AUC, NRI, and IDI were interpreted and how the overall conclusion about the improvement of the risk model was reached.
AUC, NRI, and IDI were correctly defined in 63, 70, and 0% of the articles, respectively. All statistically significant values and almost half of the nonsignificant values were interpreted as indicative of improvement, irrespective of the magnitude of the metrics. In addition, small, nonsignificant changes in the AUC were interpreted as an indication of improvement when NRI and IDI were statistically significant.
Researchers have insufficient knowledge about how to interpret the various metrics used to assess the predictive performance of polygenic risk models and rely on statistical significance for their interpretation. A better understanding is needed to achieve more meaningful interpretation of polygenic prediction studies.
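For readers less familiar with the three metrics, the standard definitions can be sketched in a few lines of Python. This is a minimal illustration, not the code used in any of the reviewed articles: the function names and the six-person toy data set are made up for this example, and the NRI shown is the category-free (continuous) variant, whereas many studies report a version based on risk categories.

```python
from itertools import product
from statistics import mean


def auc(risks, events):
    """Probability that a random event has a higher predicted risk
    than a random nonevent; ties count as one half."""
    cases = [r for r, e in zip(risks, events) if e]
    controls = [r for r, e in zip(risks, events) if not e]
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c, n in product(cases, controls)
    )
    return wins / (len(cases) * len(controls))


def continuous_nri(old, new, events):
    """Category-free NRI: net proportion of events whose risk moved up,
    minus net proportion of nonevents whose risk moved up."""
    def net_up(is_event):
        diffs = [n - o for o, n, e in zip(old, new, events) if bool(e) == is_event]
        up = sum(d > 0 for d in diffs)
        down = sum(d < 0 for d in diffs)
        return (up - down) / len(diffs)

    return net_up(True) - net_up(False)


def idi(old, new, events):
    """IDI: mean change in predicted risk among events
    minus mean change in predicted risk among nonevents."""
    def mean_change(is_event):
        return mean(n - o for o, n, e in zip(old, new, events) if bool(e) == is_event)

    return mean_change(True) - mean_change(False)


# Hypothetical toy data: 1 = event, 0 = nonevent; risks are predicted probabilities
# from a baseline model (old_risk) and an updated model (new_risk).
events = [1, 1, 1, 0, 0, 0]
old_risk = [0.6, 0.4, 0.5, 0.5, 0.3, 0.2]
new_risk = [0.7, 0.5, 0.4, 0.4, 0.3, 0.1]

delta_auc = auc(new_risk, events) - auc(old_risk, events)
print("Increment in AUC:", round(delta_auc, 3))
print("Continuous NRI:", round(continuous_nri(old_risk, new_risk, events), 3))
print("IDI:", round(idi(old_risk, new_risk, events), 3))
```

The three metrics are on different scales (the AUC is a probability between 0 and 1, the continuous NRI ranges from -2 to 2, and the IDI is a difference in mean predicted risks), which is one reason why their values cannot be compared directly or judged by statistical significance alone.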