Ted Rogers Centre for Heart Research, Toronto, Canada; University of Toronto, Toronto, Canada; ICES (formerly Institute for Clinical Evaluative Sciences), Toronto, Canada.
University of Toronto, Toronto, Canada; Peter Munk Cardiac Centre of University Health Network, Toronto, Canada.
Am Heart J. 2024 Nov;277:93-103. doi: 10.1016/j.ahj.2024.07.017. Epub 2024 Jul 31.
Developing accurate models for predicting the risk of 30-day readmission is a major healthcare interest. Evidence suggests that models developed using machine learning (ML) may have better discrimination than conventional statistical models (CSM), but the calibration of such models is unclear.
To compare models developed using ML with those developed using CSM for predicting 30-day readmission for cardiovascular and noncardiovascular causes in patients with heart failure (HF).
We retrospectively enrolled 10,919 patients with HF (aged >18 years) discharged alive from a hospital or emergency department (2004-2007) in Ontario, Canada. The study sample was randomly divided into training and validation sets in a 2:1 ratio. CSMs to predict 30-day readmission were developed using Fine-Gray subdistribution hazards regression (treating death as a competing risk), and the ML approach used random survival forests for competing risks (RSF-CR). Models were evaluated in the validation set using both discrimination and calibration metrics.
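To illustrate what treating death as a competing risk means for a 30-day readmission outcome, the following is a minimal sketch of the nonparametric (Aalen-Johansen) cumulative incidence estimate. It is not the Fine-Gray or RSF-CR model used in the study; the function name, the event coding (0 = censored, 1 = readmission, 2 = death), and the toy data are assumptions made for illustration only.

```python
import numpy as np

def cumulative_incidence(time, event, cause, horizon):
    """Aalen-Johansen estimate of P(event of `cause` occurs by `horizon`).

    Assumed coding (hypothetical): event == 0 is censored, positive
    integers are event types, e.g. 1 = readmission, 2 = death.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    surv = 1.0  # probability of being free of any event just before t
    cif = 0.0   # cumulative incidence for the cause of interest
    # Step through the distinct event times up to the horizon.
    for t in np.unique(time[(event > 0) & (time <= horizon)]):
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (event == cause))
        d_any = np.sum((time == t) & (event > 0))
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
    return cif

# Toy data: days to first event and event type (illustrative, not study data).
days = [5, 10, 12, 20, 40, 50, 60, 70, 80, 90]
kind = [1, 2, 1, 2, 1, 0, 0, 0, 0, 0]
risk_30d = cumulative_incidence(days, kind, cause=1, horizon=30)
```

With no censoring before day 30, the estimate reduces to the plain proportion of patients readmitted by day 30. Patients who die first stay in the denominator as people who can no longer be readmitted, rather than being treated as censored.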
In the validation sample of 3602 patients, RSF-CR (c-statistic=0.620) showed similar discrimination to the Fine-Gray competing risk model (c-statistic=0.621) for 30-day cardiovascular readmission. In contrast, for 30-day noncardiovascular readmission, the Fine-Gray model (c-statistic=0.641) slightly outperformed the RSF-CR model (c-statistic=0.632). For both outcomes, the Fine-Gray model displayed better calibration than RSF-CR in calibration plots of observed vs predicted risks across the deciles of predicted risk.
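The two kinds of metric reported above can be sketched as follows. This is a simplified illustration that assumes a binary 30-day outcome with complete follow-up, so it ignores the censoring and competing-risk adjustments the study's metrics would need; the function names and toy values are hypothetical.

```python
import numpy as np

def c_statistic(pred, outcome):
    """Share of (event, non-event) pairs in which the event patient has
    the higher predicted risk; ties count as half."""
    pred = np.asarray(pred, dtype=float)
    outcome = np.asarray(outcome, dtype=int)
    cases = pred[outcome == 1]
    controls = pred[outcome == 0]
    diff = cases[:, None] - controls[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def calibration_by_decile(pred, outcome, n_groups=10):
    """Return (mean predicted risk, observed event rate) for each group
    of patients ordered by predicted risk."""
    pred = np.asarray(pred, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    order = np.argsort(pred, kind="stable")
    groups = np.array_split(order, n_groups)
    return [(float(pred[g].mean()), float(outcome[g].mean())) for g in groups]

# Toy predictions and outcomes (illustrative, not study data).
c_example = c_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
toy_pred = np.linspace(0.05, 0.95, 20)
toy_outcome = np.array([0, 1] * 10)
table = calibration_by_decile(toy_pred, toy_outcome)
```

Plotting the observed rate against the mean predicted risk for each decile gives a calibration plot of the kind described; points close to the diagonal indicate good calibration. Two models can have nearly identical c-statistics and still differ markedly on such a plot, because the c-statistic depends only on the ranking of predictions.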
Fine-Gray models had similar discrimination but superior calibration to the RSF-CR model, highlighting the importance of reporting calibration metrics for ML-based prediction models. The discrimination was modest in all readmission prediction models regardless of the methods used.