Department of Pharmacy, Fujian Maternity and Child Health Hospital College of Clinical Medicine for Obstetrics and Gynecology and Pediatrics, Fujian Medical University, #18 Daoshan Road, Fuzhou, 350001, China.
Crit Care. 2023 Oct 24;27(1):406. doi: 10.1186/s13054-023-04683-4.
Venous thromboembolism (VTE) is a severe complication in critically ill patients that often results in death and long-term disability, and it is one of the major contributors to the global burden of disease. This study aimed to construct an interpretable machine learning (ML) model for predicting VTE in critically ill patients based on clinical features and laboratory indicators.
Data were extracted from the eICU Collaborative Research Database (version 2.0). Stepwise logistic regression was used to select the predictors included in the final model. Random forest, extreme gradient boosting (XGBoost), and support vector machine algorithms were used to construct the models with fivefold cross-validation. Model performance was assessed using the area under the curve (AUC), accuracy, no information rate, balanced accuracy, kappa, sensitivity, specificity, precision, and F1 score. In addition, the DALEX package was used to improve the interpretability of the final model.
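Apart from the AUC, the performance measures listed above can all be derived from a single binary confusion matrix. The following is a minimal Python sketch of those standard definitions, not the authors' code: the `Confusion` class, the `binary_metrics` function, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix (VTE is the positive class)."""
    tp: int  # VTE cases predicted as VTE
    fp: int  # non-VTE cases predicted as VTE
    fn: int  # VTE cases predicted as non-VTE
    tn: int  # non-VTE cases predicted as non-VTE


def binary_metrics(c: Confusion) -> dict[str, float]:
    """Standard threshold-based metrics; assumes no denominator is zero."""
    n = c.tp + c.fp + c.fn + c.tn
    pos, neg = c.tp + c.fn, c.tn + c.fp
    sensitivity = c.tp / pos
    specificity = c.tn / neg
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    p_e = ((c.tp + c.fp) * pos + (c.fn + c.tn) * neg) / n**2
    return {
        "accuracy": accuracy,
        # Accuracy of always predicting the majority class.
        "no_information_rate": max(pos, neg) / n,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "kappa": (accuracy - p_e) / (1 - p_e),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Hypothetical counts, not taken from the study.
example = Confusion(tp=80, fp=10, fn=20, tn=890)
metrics = binary_metrics(example)
```

With a rare outcome such as VTE, the no information rate is already high, which is why kappa, balanced accuracy, and F1 are reported alongside plain accuracy.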
The study ultimately included 109,044 patients, of whom 1,647 (1.5%) developed VTE during their ICU stay. Among the three models, the random forest model performed best (AUC: 0.9378; accuracy: 0.9958; kappa: 0.8371; precision: 0.9095; F1 score: 0.8393; sensitivity: 0.7791; specificity: 0.9989).
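As a quick consistency check on the reported figures, the F1 score should equal the harmonic mean of precision and sensitivity, and the cohort prevalence gives a rough accuracy baseline for a model that always predicts "no VTE". The short Python sketch below uses only numbers stated above. One assumption is made: the baseline is computed on the whole cohort, whereas the abstract does not say which subset the metrics were evaluated on, so it only approximates the no information rate.

```python
# Values reported for the random forest model.
precision = 0.9095
sensitivity = 0.7791
reported_f1 = 0.8393
reported_accuracy = 0.9958

# F1 is the harmonic mean of precision and sensitivity (recall).
f1 = 2 * precision * sensitivity / (precision + sensitivity)
f1_matches = abs(f1 - reported_f1) < 5e-4  # agreement within rounding

# Cohort-level prevalence and the accuracy of always predicting "no VTE".
# Assumption: the whole cohort is used as the reference population here.
prevalence = 1647 / 109_044
baseline_accuracy = 1 - prevalence

print(f"F1 from precision and sensitivity: {f1:.4f}")
print(f"Prevalence: {prevalence:.4f}, baseline accuracy: {baseline_accuracy:.4f}")
```

Because a trivial classifier would already reach an accuracy of roughly 0.985 on a cohort with this prevalence, the reported sensitivity and kappa are more informative than the accuracy alone.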
ML models can be a reliable tool for predicting VTE in critically ill patients. Among the models constructed, the random forest model was the most effective; it can help users identify patients at high risk of VTE early so that early intervention can be implemented to reduce the burden of VTE on patients.