Soave David M, Strug Lisa J
Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, ON, Canada.
Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
Front Genet. 2018 May 22;9:177. doi: 10.3389/fgene.2018.00177. eCollection 2018.
Risk prediction models can translate genetic association findings into tools for clinical decision-making. Most models are evaluated on their ability to discriminate, whereas the calibration of risk-prediction models is largely overlooked in applications. Models that demonstrate good discrimination in training datasets can perform poorly in new patient populations if they are not properly calibrated to produce unbiased estimates of risk. Poor calibration can arise from missing covariates, such as genetic interactions that are unknown or unmeasured. We demonstrate that models omitting interactions can lead to increased bias in predicted risk for patients at the tails of the risk distribution, i.e., those patients who are most likely to be affected by clinical decision-making. We propose a new calibration test for Cox risk-prediction models that aggregates martingale residuals for subjects in the extreme high- and low-risk groups; the test statistic is the maximum obtained by varying which risk groups are included in the extremes. To estimate the empirical significance of the test statistic, we simulate from a Gaussian distribution using the covariance matrix of the grouped sums of martingale residuals. Simulations show that the new test maintains control of type I error and has improved power over a conventional goodness-of-fit test when risk prediction deviates at the tails of the risk distribution. We apply our method in the development of a prediction model for risk of cystic fibrosis-related diabetes. Our study highlights the importance of assessing both calibration and discrimination in predictive modeling, and provides a complementary tool for the assessment of risk model calibration.
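The general shape of the test described in the abstract (group subjects by predicted risk, sum martingale residuals within groups, maximize a tail statistic over how many extreme groups are included, and calibrate it by Gaussian simulation) can be sketched as follows. This is a simplified illustration, not the authors' statistic. The function names, the quadratic-form tail statistic, and the diagonal covariance estimate are all assumptions introduced here. The martingale residuals and risk scores are assumed to come from an already fitted Cox model.

```python
import numpy as np

def grouped_sums(resid, risk, n_groups=10):
    """Sum residuals within risk-ordered groups of (nearly) equal size."""
    order = np.argsort(risk, kind="stable")
    groups = np.array_split(order, n_groups)
    sums = np.array([resid[g].sum() for g in groups])
    # ASSUMPTION: residuals are treated as independent, so the covariance of
    # the grouped sums is diagonal. The abstract does not give the estimator.
    cov = np.diag([np.sum(resid[g] ** 2) for g in groups])
    return sums, cov

def max_tail_stat(sums, cov, max_k):
    """Maximum over k of a quadratic form in the (low-tail, high-tail) sums."""
    n_groups = len(sums)
    best = 0.0
    for k in range(1, max_k + 1):
        A = np.zeros((2, n_groups))
        A[0, :k] = 1.0               # k lowest-risk groups
        A[1, n_groups - k:] = 1.0    # k highest-risk groups
        v = A @ sums
        V = A @ cov @ A.T
        best = max(best, float(v @ np.linalg.solve(V, v)))
    return best

def tail_calibration_test(resid, risk, n_groups=10, max_k=3,
                          n_sim=2000, seed=0):
    """Return (observed statistic, empirical p-value)."""
    resid = np.asarray(resid, dtype=float)
    risk = np.asarray(risk, dtype=float)
    if max_k > n_groups // 2:
        raise ValueError("max_k must not exceed n_groups // 2")
    sums, cov = grouped_sums(resid, risk, n_groups)
    observed = max_tail_stat(sums, cov, max_k)
    # Null reference: grouped sums drawn from N(0, cov), same maximization.
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(np.zeros(n_groups), cov, size=n_sim)
    null = np.array([max_tail_stat(z, cov, max_k) for z in draws])
    p_value = (1 + np.sum(null >= observed)) / (n_sim + 1)
    return observed, float(p_value)

# Synthetic illustration only: these are not data from the study.
rng = np.random.default_rng(1)
risk = rng.normal(size=1000)
resid = rng.normal(size=1000)
stat_ok, p_ok = tail_calibration_test(resid, risk)
# Shift the residuals of the top 10% of risk scores to mimic tail miscalibration.
resid_bad = resid + 1.0 * (risk > np.quantile(risk, 0.9))
stat_bad, p_bad = tail_calibration_test(resid_bad, risk)
print(stat_ok, p_ok, stat_bad, p_bad)
```

Taking the maximum over `k` inflates the statistic relative to any single fixed choice of tail width, which is why the null draws repeat the same maximization rather than being compared against a chi-squared reference.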