School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK.
NIHR Biomedical Research Centre, Guy's and St Thomas' NHS Foundation Trust and King's College London, London, UK.
BMC Neurol. 2022 May 27;22(1):195. doi: 10.1186/s12883-022-02722-1.
We aimed to develop and validate machine learning (ML) models that predict 30-day mortality after stroke, for use in mortality risk stratification and as benchmarking models for quality improvement in stroke care.
Data from the UK Sentinel Stroke National Audit Programme from 2013 to 2019 were used. Models were developed using XGBoost, logistic regression (LR), and LR with elastic net, with and without interaction terms, on a randomly selected 80% of admissions from 2013 to 2018. They were validated on the remaining 20% of those admissions and temporally validated on 2019 admissions. The models were developed with 30 variables. A reference model was developed using LR and 4 variables. The performance of all models was evaluated in terms of discrimination, calibration, reclassification, Brier scores and decision curve analysis.
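As an illustration of two of the evaluation measures named above, the following is a minimal sketch of the Brier score and the area under the ROC curve (AUC) in plain Python. It is not the authors' code: the function names and the toy data are invented for this example, and the pairwise AUC calculation is only practical for small samples.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen death receives a higher
    predicted risk than a randomly chosen survivor (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only: 1 = died within 30 days.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.02, 0.10, 0.40, 0.03, 0.80, 0.20, 0.05, 0.15]

brier = brier_score(y_true, y_prob)
auc_value = auc(y_true, y_prob)
print(f"Brier score: {brier:.4f}")
print(f"AUC: {auc_value:.4f}")
```

A lower Brier score indicates better overall accuracy of the predicted risks, while a higher AUC indicates better discrimination between patients who died and those who survived.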
In total, 488,497 stroke patients with a 12.3% 30-day mortality rate were included in the analysis. In the 2019 temporal validation set, the XGBoost model obtained the lowest Brier score (0.069 (95% CI: 0.068-0.071)) and the highest area under the ROC curve (AUC) (0.895 (95% CI: 0.891-0.900)), outperforming the LR reference model by 0.04 AUC (p < 0.001) and the LR model with elastic net and interaction terms by 0.003 AUC (p < 0.001). All models were perfectly calibrated for the low-risk (< 5%) and moderate-risk (5-15%) groups and underestimated risk by ≈1% for the high-risk group (> 15%). The XGBoost model reclassified 1648 (8.1%) of the cases rated low-risk by the LR reference model as moderate or high-risk, and it gained the most net benefit in decision curve analysis.
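The reclassification and decision curve results above can be illustrated with a short sketch. It assumes the three risk bands quoted in the abstract (< 5%, 5-15%, > 15%); how a prediction of exactly 5% or 15% is assigned is an assumption here, as are the function names and the toy data. Net benefit uses the standard decision curve definition, true positives per patient minus false positives per patient weighted by the odds of the threshold probability.

```python
from collections import Counter
from typing import Sequence


def risk_group(p: float) -> str:
    """Assign a predicted 30-day mortality risk to a band.
    Boundary handling at exactly 5% and 15% is an assumption."""
    if p < 0.05:
        return "low"
    if p <= 0.15:
        return "moderate"
    return "high"


def reclassification_table(ref_probs: Sequence[float],
                           new_probs: Sequence[float]) -> Counter:
    """Count patients by (reference-model band, new-model band)."""
    return Counter((risk_group(r), risk_group(n))
                   for r, n in zip(ref_probs, new_probs))


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy data, invented for illustration only: 1 = died within 30 days.
y_true = [0, 0, 1, 0, 1, 0]
ref_probs = [0.03, 0.04, 0.02, 0.10, 0.30, 0.04]  # hypothetical reference model
new_probs = [0.02, 0.08, 0.20, 0.12, 0.25, 0.03]  # hypothetical richer model

table = reclassification_table(ref_probs, new_probs)
moved_up = table[("low", "moderate")] + table[("low", "high")]
nb_new = net_benefit(y_true, new_probs, threshold=0.05)
nb_ref = net_benefit(y_true, ref_probs, threshold=0.05)

print(f"Low-risk by reference, moved up by new model: {moved_up}")
print(f"Net benefit at 5% threshold: new={nb_new:.4f}, reference={nb_ref:.4f}")
```

A decision curve repeats the net benefit calculation over a range of thresholds and compares the models against treating all patients or none as high-risk.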
All models with 30 variables are potentially useful as benchmarking models in stroke-care quality improvement, with the ML model slightly outperforming the others.