Comparison of machine learning techniques in prediction of mortality following cardiac surgery: analysis of over 220 000 patients from a large national database.

Affiliations

Division of Cardiac Surgery, Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK.

Alan Turing Institute, London, UK.

Publication information

Eur J Cardiothorac Surg. 2023 Jun 1;63(6). doi: 10.1093/ejcts/ezad183.

DOI:10.1093/ejcts/ezad183

PMID:37154705

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10275911/

Abstract

OBJECTIVES

To systematically compare in-hospital mortality risk prediction after cardiac surgery between the predominant scoring system, the European System for Cardiac Operative Risk Evaluation (EuroSCORE) II; logistic regression (LR) retrained on the same variables; and alternative machine learning (ML) techniques: random forest (RF), neural networks (NN), XGBoost and weighted support vector machine.

METHODS

Retrospective analysis of prospectively and routinely collected data on adult patients undergoing cardiac surgery in the UK from January 2012 to March 2019. Data were temporally split 70:30 into training and validation subsets. Mortality prediction models were created using the 18 variables of EuroSCORE II. Discrimination, calibration and clinical utility were then compared. Changes in model performance and variable importance over time, and hospital- and operation-based model performance, were also reviewed.
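A temporal 70:30 split of this kind can be sketched as follows. This is a minimal illustration, assuming the split orders patients by operation date and assigns the earliest 70% to training; the abstract does not give the exact procedure, and the field names (`surgery_date`, `died`) and toy records are hypothetical.

```python
from datetime import date


def temporal_split(records, date_key="surgery_date", train_fraction=0.7):
    """Order records chronologically and cut once: earlier part trains, later part validates."""
    ordered = sorted(records, key=lambda r: r[date_key])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Toy records with strictly increasing dates (hypothetical fields, not study data).
records = [
    {"id": i, "surgery_date": date(2012 + i // 2, 1 + (i % 2) * 6, 1), "died": 0}
    for i in range(10)
]

train, valid = temporal_split(records)
print(len(train), len(valid))
```

Unlike a random split, every validation record here is later than every training record, so the evaluation reflects how a model trained on past patients behaves on future ones.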

RESULTS

Of the 227 087 adults who underwent cardiac surgery during the study period, there were 6258 deaths (2.76%). In the testing cohort, discrimination improved with XGBoost [95% confidence interval (CI): area under the receiver operating characteristic curve (AUC), 0.834-0.834; F1 score, 0.276-0.280] and RF (95% CI: AUC, 0.833-0.834; F1, 0.277-0.281) compared with EuroSCORE II (95% CI: AUC, 0.817-0.818; F1, 0.243-0.245). There was no significant improvement in calibration with ML and retrained LR compared with EuroSCORE II. However, EuroSCORE II overestimated risk across all deciles of risk and over time. Calibration drift was lower for NN, XGBoost and RF than for EuroSCORE II. Decision curve analysis showed XGBoost and RF to have greater net benefit than EuroSCORE II.
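For readers unfamiliar with decision curve analysis, the net benefit it reports at a risk threshold p_t is commonly defined as TP/n - (FP/n) x p_t/(1 - p_t), where TP and FP count patients flagged as high risk at that threshold. The sketch below computes this quantity on made-up outcomes and predicted risks; the data, the threshold and the function name are illustrative assumptions, not values or code from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one risk threshold: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y_true)
    flagged = [(y, p >= threshold) for y, p in zip(y_true, y_prob)]
    tp = sum(1 for y, is_high in flagged if is_high and y == 1)
    fp = sum(1 for y, is_high in flagged if is_high and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Made-up outcomes (1 = died) and predicted risks for ten patients.
y_true = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.2, 0.1, 0.4, 0.6, 0.05, 0.05, 0.1, 0.3, 0.02]
threshold = 0.25

model_nb = net_benefit(y_true, y_prob, threshold)
# "Treat all" reference strategy: every patient is flagged as high risk.
treat_all_nb = net_benefit(y_true, [1.0] * len(y_true), threshold)
print(model_nb, treat_all_nb)
```

A decision curve repeats this calculation over a range of thresholds and plots each model against the "treat all" and "treat none" (net benefit 0) strategies.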

CONCLUSIONS

ML techniques showed some statistical improvements over retrained LR and EuroSCORE II. The clinical impact of this improvement is modest at present. However, the incorporation of additional risk factors in future studies may improve upon these findings and warrants further study.
