Siregar Sabrina, Nieboer Daan, Vergouwe Yvonne, Versteegh Michel I M, Noyez Luc, Vonk Alexander B A, Steyerberg Ewout W, Takkenberg Johanna J M
From the Department of Cardio-Thoracic Surgery, Leiden University Medical Center, Leiden, The Netherlands (S.S., M.I.M.V.); Department of Public Health, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands (D.N., Y.V., E.W.S.); Department of Cardio-Thoracic Surgery, Radboud University Nijmegen Medical Center, Nijmegen, The Netherlands (L.N.); Department of Cardio-Thoracic Surgery, VU Medical Center, Amsterdam, The Netherlands (A.B.A.V.); and Department of Cardio-Thoracic Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands (J.J.M.T.).
Circ Cardiovasc Qual Outcomes. 2016 Mar;9(2):171-81. doi: 10.1161/CIRCOUTCOMES.114.001645. Epub 2016 Mar 1.
The predictive performance of static risk prediction models such as EuroSCORE deteriorates over time. We aimed to explore different methods for continuous updating of EuroSCORE (dynamic modeling) to improve risk prediction.
Data on adult cardiac surgery from 2007 to 2012 (n=95,240) were extracted from the Netherlands Association for Cardio-Thoracic Surgery database. The logistic EuroSCORE predicting in-hospital death was updated using 6 methods: recalibrating the intercept of the logistic regression model; recalibrating the intercept and joint effects of the prognostic factors; re-estimating all prognostic factor effects; re-estimating all prognostic factor effects and applying shrinkage of the estimates; applying a test procedure to select one of these; and a Bayesian learning strategy. Models were updated with 1 or 3 years of data, in all cardiac surgery or within operation subgroups. Performance was tested in the subsequent year according to discrimination (area under the receiver operating characteristic curve, abbreviated area under the curve) and calibration (calibration slope and calibration-in-the-large). Compared with the original EuroSCORE, all updating methods resulted in improved calibration-in-the-large (range -0.17 to 0.04 versus -1.13 to -0.97, ideally 0.0). Calibration slope (range 0.92-1.15) and discrimination (area under the curve range 0.83-0.87) were similar across methods. In small subgroups, such as aortic valve replacement and aortic valve replacement+coronary artery bypass grafting, extensive updating using 1 year of data led to poorer performance than using the original EuroSCORE. The choice of updating method had little effect on benchmarking results of all cardiac surgery.
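The two simplest updating methods listed above, and the two calibration measures used to judge them, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are simulated, the function names (`fit_recalibration`, `simulate_year`) are hypothetical, and the assumed miscalibration (a shift of -1.0 on the logit scale) is chosen only to resemble the calibration-in-the-large reported for the original EuroSCORE. The sketch assumes the standard definitions: calibration-in-the-large is the intercept estimated with the slope fixed at 1, and the calibration slope is the coefficient of the linear predictor in a logistic recalibration model.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_recalibration(lp, y, update_slope, max_iter=50, tol=1e-10):
    """Fit logit P(death) = a + b * lp by Newton-Raphson.

    lp is the linear predictor (logit of the predicted risk) of the
    existing model. With update_slope=False, b stays at 1 and only the
    intercept a is estimated (intercept recalibration).
    """
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        p = [sigmoid(a + b * x) for x in lp]
        w = [pi * (1.0 - pi) for pi in p]
        g_a = sum(yi - pi for yi, pi in zip(y, p))  # score for a
        h_aa = sum(w)                               # information for a
        if not update_slope:
            da, db = g_a / h_aa, 0.0
        else:
            g_b = sum((yi - pi) * x for yi, pi, x in zip(y, p, lp))
            h_ab = sum(wi * x for wi, x in zip(w, lp))
            h_bb = sum(wi * x * x for wi, x in zip(w, lp))
            det = h_aa * h_bb - h_ab * h_ab
            # Solve the 2x2 Newton system explicitly.
            da = (h_bb * g_a - h_ab * g_b) / det
            db = (h_aa * g_b - h_ab * g_a) / det
        a, b = a + da, b + db
        if max(abs(da), abs(db)) < tol:
            break
    return a, b


def simulate_year(rng, n, true_shift):
    """Simulated patients: the old model overpredicts by true_shift (logit)."""
    lp = [rng.gauss(-3.0, 1.0) for _ in range(n)]
    y = [1 if rng.random() < sigmoid(x + true_shift) else 0 for x in lp]
    return lp, y


rng = random.Random(0)
lp_fit, y_fit = simulate_year(rng, 20000, -1.0)    # year used for updating
lp_next, y_next = simulate_year(rng, 20000, -1.0)  # subsequent year for testing

# Calibration-in-the-large of the original model in the test year.
citl_original, _ = fit_recalibration(lp_next, y_next, update_slope=False)

# Method 1: recalibrate the intercept only.
a_int, _ = fit_recalibration(lp_fit, y_fit, update_slope=False)

# Method 2: recalibrate the intercept and the joint effect (slope).
a_log, b_log = fit_recalibration(lp_fit, y_fit, update_slope=True)

# Evaluate the method 2 update in the subsequent year.
lp_updated = [a_log + b_log * x for x in lp_next]
citl_updated, _ = fit_recalibration(lp_updated, y_next, update_slope=False)
_, slope_updated = fit_recalibration(lp_updated, y_next, update_slope=True)

print("calibration-in-the-large, original:", round(citl_original, 2))
print("calibration-in-the-large, updated: ", round(citl_updated, 2))
print("calibration slope, updated:        ", round(slope_updated, 2))
```

In this simulated setting only the intercept is wrong, so both methods should move calibration-in-the-large close to 0 while leaving the slope near 1. The more extensive methods (re-estimating all prognostic factor effects, with or without shrinkage, and the Bayesian strategy) need the individual risk factors and are not sketched here.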
Several methods for dynamic modeling may result in good discrimination and superior calibration compared with the original EuroSCORE. For large populations, all methods are appropriate. For smaller subgroups, it is recommended to use data from multiple years or a Bayesian approach.