Risk prediction of 30-day mortality after stroke using machine learning: a nationwide registry-based cohort study.

Affiliations

School of Population Health & Environmental Sciences, Faculty of Life Science and Medicine, King's College London, London, UK.

NIHR Biomedical Research Centre, Guy's and St Thomas' NHS Foundation Trust and King's College London, London, UK.

Publication information

BMC Neurol. 2022 May 27;22(1):195. doi: 10.1186/s12883-022-02722-1.

PMID:35624434

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9137068/

Abstract

BACKGROUND

We aimed to develop and validate machine learning (ML) models for 30-day stroke mortality for mortality risk stratification and as benchmarking models for quality improvement in stroke care.

METHODS

Data from the UK Sentinel Stroke National Audit Programme from 2013 to 2019 were used. Models were developed using XGBoost, logistic regression (LR), and LR with elastic net with and without interaction terms, using a randomly selected 80% of admissions from 2013 to 2018; they were validated on the remaining 20% of admissions and temporally validated on 2019 admissions. The models were developed with 30 variables. A reference model was developed using LR and 4 variables. The performance of all models was evaluated in terms of discrimination, calibration, reclassification, Brier scores, and decision curves.
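As an illustration of two of the evaluation measures named above, the following minimal sketch computes a Brier score and an AUC from predicted probabilities. It is not the authors' code: the function names and the toy data are hypothetical, and the AUC is computed with the pairwise (Mann-Whitney) formulation, which is assumed here as a simple stand-in for whatever software the study used.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted risk and observed outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Probability that a randomly chosen death is ranked above a randomly
    chosen survivor; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died within 30 days, 0 = survived.
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_prob = [0.02, 0.10, 0.40, 0.05, 0.80, 0.30, 0.01, 0.20]

print(f"Brier score: {brier_score(y_true, y_prob):.4f}")
print(f"AUC: {roc_auc(y_true, y_prob):.4f}")
```

A lower Brier score and a higher AUC both indicate better predictions; the pairwise AUC loop is quadratic in sample size, so a rank-based implementation would be preferable on a registry-sized cohort.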

RESULTS

In total, 488,497 stroke patients with a 12.3% 30-day mortality rate were included in the analysis. In the 2019 temporal validation set, the XGBoost model obtained the lowest Brier score (0.069; 95% CI: 0.068-0.071) and the highest area under the ROC curve (AUC) (0.895; 95% CI: 0.891-0.900), outperforming the LR reference model by 0.04 AUC (p < 0.001) and the LR model with elastic net and interaction terms by 0.003 AUC (p < 0.001). All models were perfectly calibrated for the low-risk (< 5%) and moderate-risk (5-15%) groups and underestimated risk by ≈1% for the high-risk group (> 15%). The XGBoost model reclassified 1648 (8.1%) cases rated low-risk by the LR reference model as moderate or high-risk and gained the most net benefit in decision curve analysis.
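The reclassification and decision curve results can be made concrete with a small sketch. It assumes the risk bands quoted above (low < 5%, moderate 5-15%, high > 15%) and the standard net-benefit definition used in decision curve analysis, TP/n - (FP/n) x pt/(1 - pt) at threshold probability pt; the abstract does not state the exact formula the authors applied, and all function names and data below are hypothetical.

```python
from typing import Sequence


def risk_group(p: float) -> str:
    """Map a predicted 30-day mortality risk to the bands quoted in the abstract."""
    if p < 0.05:
        return "low"
    if p <= 0.15:
        return "moderate"
    return "high"


def count_upward_reclassified(ref_prob: Sequence[float],
                              new_prob: Sequence[float]) -> int:
    """Cases the reference model rates low-risk that the new model rates
    moderate or high-risk."""
    return sum(
        1
        for r, n in zip(ref_prob, new_prob)
        if risk_group(r) == "low" and risk_group(n) != "low"
    )


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit of flagging every patient whose risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data: outcomes plus predictions from a reference and a new model.
outcomes = [0, 0, 1, 0, 1, 0, 0, 1]
reference = [0.03, 0.04, 0.04, 0.10, 0.30, 0.20, 0.02, 0.12]
candidate = [0.02, 0.10, 0.40, 0.05, 0.80, 0.30, 0.01, 0.20]

print("Reclassified upward:", count_upward_reclassified(reference, candidate))
print("Net benefit at 15%:", round(net_benefit(outcomes, candidate, 0.15), 4))
```

Plotting net benefit across a range of thresholds for each model gives the decision curve; the model whose curve lies highest over the clinically relevant thresholds gains the most net benefit.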

CONCLUSIONS

All models with 30 variables are potentially useful as benchmarking models in stroke-care quality improvement, with the ML model slightly outperforming the others.
