E-CatBoost: An efficient machine learning framework for predicting ICU mortality using the eICU Collaborative Research Database.

Affiliations

Department of Business Analytics and Information Systems, Tippie College of Business, University of Iowa, Iowa City, IA, United States of America.

Civil and Environmental Engineering Department, Michigan State University, East Lansing, MI, United States of America.

Publication information

PLoS One. 2022 May 5;17(5):e0262895. doi: 10.1371/journal.pone.0262895. eCollection 2022.

DOI:10.1371/journal.pone.0262895

PMID:35511882

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9070907/

Abstract

Improving the Intensive Care Unit (ICU) management network and building cost-effective, well-managed healthcare systems are high priorities for healthcare units. Accurate and explainable mortality prediction models help identify the most critical risk factors for patients' survival/death status and detect the patients most in need of care early. This study proposes a highly accurate and efficient machine learning model for predicting ICU mortality status upon discharge using the information available during the first 24 hours of admission. The most important features in mortality prediction are identified, and the effect of changing each feature on the prediction is studied. We used supervised machine learning models and illness severity scoring systems to benchmark the mortality prediction. We also implemented a combination of SHAP, LIME, partial dependence, and individual conditional expectation plots to explain the predictions made by the best-performing model (CatBoost). We proposed E-CatBoost, an optimized and efficient patient mortality prediction model that can accurately predict patients' discharge status using only ten input features. We used eICU-CRD v2.0 to train and validate the models; the dataset contains information on over 200,000 ICU admissions. The patients were divided into twelve disease groups, and models were fitted and tuned for each group. The models' predictive performance was evaluated using the area under the receiver operating characteristic curve (AUROC). Across the defined disease groups, the AUROC scores ranged from 0.86 [std: 0.02] to 0.92 [std: 0.02] for the CatBoost models and from 0.83 [std: 0.02] to 0.91 [std: 0.03] for the E-CatBoost models; when measured over the entire patient population, their AUROC scores were 7 to 18 percent and 2 to 12 percent higher than those of the baseline models, respectively. Based on SHAP explanations, we identified age, heart rate, respiratory rate, blood urea nitrogen, and creatinine level as the most critical cross-disease features in mortality prediction.
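The abstract reports AUROC as a mean with a standard deviation for each disease group. The sketch below is one way such a figure could be computed, not the authors' code: it uses the rank-sum (Mann-Whitney U) formulation of AUROC and summarizes scores over several evaluation folds. The function names `auroc` and `summarize_folds`, the toy labels and scores, and the choice of the sample standard deviation are all assumptions made for illustration; the abstract does not say how the folds were formed or which standard deviation was used.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation.

    labels: 0/1 outcomes (1 = died, 0 = survived); scores: predicted risks.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with pairs[i].
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


def summarize_folds(folds):
    """Return (mean, sample std) of AUROC over (labels, scores) folds."""
    fold_scores = [auroc(y, p) for y, p in folds]
    return mean(fold_scores), stdev(fold_scores)


# Hypothetical toy folds, not data from the study.
toy_folds = [
    ([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]),
    ([0, 1, 0, 1], [0.20, 0.90, 0.30, 0.70]),
]
fold_mean, fold_std = summarize_folds(toy_folds)
print(f"AUROC {fold_mean:.2f} [std: {fold_std:.2f}]")
```

Applied to one disease group's held-out folds, `summarize_folds` yields a pair in the same "score [std: x]" form that the abstract uses.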


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff48/9070907/908de298afb3/pone.0262895.g001.jpg
