Department of Thoracic Surgery, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 650 Xinsongjiang Road, Shanghai, 201620, China.
Department of Cardiovascular Surgery, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, No. 85 Wujin Road, Shanghai, 200080, China.
J Transl Med. 2024 Aug 15;22(1):772. doi: 10.1186/s12967-024-05395-1.
Acute respiratory distress syndrome (ARDS) after cardiac surgery is a severe respiratory complication with high mortality and morbidity. Traditional clinical approaches may lead to under-recognition of this heterogeneous syndrome, potentially resulting in delayed diagnosis. This study aims to develop and externally validate seven machine learning (ML) models, trained on electronic health record data, for predicting ARDS after cardiac surgery.
This multicenter, observational cohort study included patients who underwent cardiac surgery in the training and testing cohorts (data from Nanjing First Hospital) and in a validation cohort (data from Shanghai General Hospital). The number of important features was determined using the sliding-window sequential forward feature selection (SWSFS) method. We developed a set of tree-based ML models, including Decision Tree, GBDT, AdaBoost, XGBoost, LightGBM, Random Forest, and Deep Forest. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and the Brier score. The SHapley Additive exPlanation (SHAP) technique was employed to interpret the ML model. Furthermore, the ML models were compared with traditional scoring systems. ARDS was defined according to the Berlin definition.
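The two evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' code: the function names `brier_score` and `roc_auc` and the toy labels and probabilities are assumptions made purely for illustration. The Brier score is the mean squared difference between predicted probability and observed outcome (lower is better), and the AUC is the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = ARDS, 0 = no ARDS; probabilities are made up.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
```

The pairwise AUC computation is quadratic in the number of cases, which is acceptable for a small illustration; a rank-based implementation would be preferable for cohorts of realistic size.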
A total of 1996 patients who had cardiac surgery were included in the study. The top five important features identified by the SWSFS were chronic obstructive pulmonary disease, preoperative albumin, central venous pressure_T4, cardiopulmonary bypass time, and left ventricular ejection fraction. Among the seven ML models, Deep Forest demonstrated the best performance, with an AUC of 0.882 and a Brier score of 0.809 in the validation cohort. Notably, the SHAP values effectively illustrated the contribution of the 13 selected features to the model output and each individual feature's effect on model prediction. In addition, the ensemble ML models demonstrated better performance than six traditional scoring systems.
Our study identified 13 important features and provided multiple ML models to enhance risk stratification for ARDS after cardiac surgery. Using these predictors and ML models might provide a basis for early diagnostic and preventive strategies in the perioperative management of ARDS patients.