Department of Rehabilitation, The Second Affiliated Hospital of Jianghan University, Wuhan, China.
Department of Respiratory Medicine, People's Hospital of Daye, The Second Affiliated Hospital of Hubei Polytechnic University, Daye, Hubei, China.
PLoS One. 2023 Jan 26;18(1):e0280606. doi: 10.1371/journal.pone.0280606. eCollection 2023.
BACKGROUND: In-hospital mortality among lung cancer patients admitted to the intensive care unit (ICU) is extremely high. This study used machine learning models to predict in-hospital mortality in critically ill lung cancer patients, with the aim of providing relevant information for clinical decision-making. METHODS: Data for the training cohort were extracted from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database, and data for the validation cohort were extracted from the eICU Collaborative Research Database (eICU-CRD). Logistic regression, random forest, decision tree, light gradient boosting machine (LightGBM), eXtreme gradient boosting (XGBoost), and an ensemble (random forest+LightGBM+XGBoost) model were used to predict in-hospital mortality and to extract important features. The AUC (area under the receiver operating characteristic curve), accuracy, F1 score and recall were used to evaluate the predictive performance of each model. Shapley Additive exPlanations (SHAP) values were calculated to evaluate the importance of each feature. RESULTS: Overall, there were 653 (24.8%) in-hospital deaths in the training cohort and 523 (21.7%) in-hospital deaths in the validation cohort. Among the six machine learning models, the ensemble model achieved the best performance. In the random forest and XGBoost models, the top 5 most influential features were the Sequential Organ Failure Assessment (SOFA) score, albumin, the Oxford Acute Severity of Illness Score (OASIS), anion gap and bilirubin. A SHAP summary plot was used to illustrate the positive or negative effects of the top 15 features in the XGBoost model. CONCLUSION: The ensemble model performed best and might be applied to forecast in-hospital mortality in critically ill lung cancer patients, and the SOFA score was the most important feature in all models. These results might offer a valuable reference to support early decision-making by ICU clinicians.
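The abstract does not state how the three tree-based models were combined or which decision threshold was used. As a minimal sketch only, the block below assumes an equal-weight soft vote (averaging predicted probabilities) and a 0.5 threshold, and shows how AUC, accuracy, F1 score and recall could then be computed. The function names (`soft_vote`, `auc`, `classification_metrics`) and all numbers are hypothetical illustrations, not the authors' code or data.

```python
"""Toy sketch: soft-voting ensemble plus AUC, accuracy, F1 and recall.

Assumptions (not from the abstract): probabilities are averaged with equal
weights, the decision threshold is 0.5, and all numbers below are made up.
"""


def soft_vote(prob_lists):
    """Average predicted probabilities across models, patient by patient."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]


def auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, which is
    acceptable for a small illustration.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, recall and F1 after thresholding the predicted probabilities."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up labels (1 = died in hospital) and per-model probabilities.
    y_true = [1, 0, 1, 0, 0, 1]
    rf_prob = [0.9, 0.2, 0.4, 0.1, 0.6, 0.8]
    lgbm_prob = [0.8, 0.3, 0.6, 0.2, 0.4, 0.7]
    xgb_prob = [0.7, 0.1, 0.7, 0.3, 0.3, 0.9]

    ensemble_prob = soft_vote([rf_prob, lgbm_prob, xgb_prob])
    print("AUC:", auc(y_true, ensemble_prob))
    print(classification_metrics(y_true, ensemble_prob))
```

In practice a library implementation would normally replace these hand-written helpers; the pure-Python versions are used here only to make each metric's definition explicit.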