Liu Xianglin, Huang Zhihua, Guo Yizhi, Li Yandeng, Zhu Jianming, Wen Jun, Gao Yunchun, Liu Jianyi
Changde Hospital, Xiangya School of Medicine, Central South University (The First People's Hospital of Changde City), Changde, China.
J Med Internet Res. 2025 Apr 28;27:e71413. doi: 10.2196/71413.
Sepsis is a life-threatening condition frequently observed in patients with intracerebral hemorrhage (ICH) who are critically ill. Early and accurate identification and prediction of sepsis are crucial. Machine learning (ML)-based predictive models exhibit promising sepsis prediction capabilities in emergency settings. However, their application in predicting sepsis among patients with ICH is still limited.
The aim of the study is to develop an ML-driven risk calculator for early prediction of sepsis in patients with ICH who are critically ill and to clarify feature importance and explain the model using the Shapley Additive Explanations method.
Patients with ICH admitted to the intensive care unit (ICU) between 2008 and 2022 were identified in the Medical Information Mart for Intensive Care IV (MIMIC-IV) database and divided into training and internal test sets. External testing was performed using the eICU Collaborative Research Database, which includes over 200,000 ICU admissions across the United States between 2014 and 2015. Sepsis following ICU admission was identified according to the Sepsis-3 criteria: suspected infection combined with an increase of ≥2 points in the Sequential Organ Failure Assessment (SOFA) score. The Boruta algorithm was used for feature selection, confirming 29 features. Nine ML algorithms were used to construct the prediction models. Predictive performance was compared using several evaluation metrics, including the area under the receiver operating characteristic curve (AUC). The Shapley Additive Explanations technique was used to interpret the final model, and a web-based risk calculator was constructed for clinical practice.
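The Sepsis-3 labeling rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' extraction code: the `IcuStay` record and its field names are hypothetical, and the real definition also depends on how the suspected-infection window and baseline SOFA score are derived from the database.

```python
from dataclasses import dataclass


@dataclass
class IcuStay:
    # Hypothetical record; field names are assumptions for illustration only.
    baseline_sofa: int         # SOFA score before the suspected infection
    current_sofa: int          # worst SOFA score around the suspected infection
    suspected_infection: bool  # e.g. cultures drawn and antibiotics started


def meets_sepsis3(stay: IcuStay) -> bool:
    """Return True when suspected infection coincides with a SOFA rise of >= 2 points."""
    sofa_increase = stay.current_sofa - stay.baseline_sofa
    return stay.suspected_infection and sofa_increase >= 2


if __name__ == "__main__":
    print(meets_sepsis3(IcuStay(baseline_sofa=2, current_sofa=4, suspected_infection=True)))
    print(meets_sepsis3(IcuStay(baseline_sofa=2, current_sofa=3, suspected_infection=True)))
```

Both conditions must hold together: a large SOFA rise without suspected infection is not labeled as sepsis under this rule.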
Overall, 2414 patients with ICH were enrolled from the Medical Information Mart for Intensive Care IV database, with 1689 and 725 patients assigned to the training and internal test sets, respectively. An external test set of 2806 patients with ICH from the eICU database was used. Among the 9 ML models tested, the categorical boosting (CatBoost) model demonstrated the best discriminative ability. After reducing features based on their importance, an explainable final CatBoost model was developed using 8 features. The final model accurately predicted sepsis in internal (AUC=0.812) and external (AUC=0.771) tests.
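For readers less familiar with the AUC values reported above, the metric can be read as the probability that a randomly chosen patient with sepsis receives a higher predicted risk than a randomly chosen patient without sepsis. The sketch below computes it by direct pairwise comparison; the function name and the toy data are assumptions for illustration and are unrelated to the study's data or code.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 outcomes (1 = sepsis)
    scores: iterable of predicted risks, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

On this reading, an AUC of 0.812 means the model ranks a patient with sepsis above a patient without sepsis in roughly 81% of such pairs.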
We constructed a web-based risk calculator with 8 features, based on the CatBoost model, to assist clinicians in identifying critically ill patients with ICH who are at high risk for sepsis.