Luo Xiang, Kan Xiuji, Wang Dongliang, Shi Yu, Zhu Siqi, Chen Zhenyu, Wang Congcong, Zhu Wenqi, Wang Xiangtong, Sun Wenwen
Jiangsu Province (Suqian) Hospital, Nanjing Medical University, China.
First Affiliated Hospital of Soochow University, Soochow University, China.
J Int Med Res. 2025 Aug;53(8):3000605251362991. doi: 10.1177/03000605251362991. Epub 2025 Aug 13.
Background
Sepsis is the leading cause of mortality in critically ill cancer patients; however, traditional prognostic models fail to capture the complexity of their immune and physiological interactions.
Methods
This retrospective study analyzed electronic health records from the Medical Information Mart for Intensive Care IV database, including the records of patients with sepsis who had a documented history of cancer and were admitted to the intensive care unit. A two-step feature selection approach, combining least absolute shrinkage and selection operator regression and recursive feature elimination, was used to identify key prognostic variables. Eight machine learning algorithms, such as random forest and extreme gradient boosting, were trained and evaluated using five-fold cross-validation. Model performance was assessed using the area under the receiver operating characteristic curve value, Brier scores, sensitivity, and specificity. SHapley Additive exPlanations, Partial Dependence Plots, and break down algorithms were applied to enhance model interpretability.
Results
The final cohort included 3364 patients admitted to the intensive care unit. Nonsurvivors had significantly higher illness severity scores (Acute Physiology Score III and Sequential Organ Failure Assessment) than survivors (p < 0.001). Among the tested models, the random forest model demonstrated superior performance, achieving the highest area under the receiver operating characteristic curve value (0.78; 95% confidence interval: 0.76-0.80) and the lowest Brier score (0.15), indicating strong predictive accuracy.
Conclusions
This study developed machine learning models for predicting in-hospital mortality in sepsis patients with a history of cancer, leveraging the Medical Information Mart for Intensive Care IV database for comprehensive risk assessment.
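The workflow described in the Methods (LASSO screening, then recursive feature elimination, then a classifier scored by five-fold cross-validation with AUC and Brier score) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code. The synthetic data from make_classification stands in for the MIMIC-IV cohort, and the L1-penalized logistic regression, the regularization strength, the number of retained features, and the forest sizes are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cohort (assumption: the real study used MIMIC-IV
# records; the sample size, feature count, and class balance here are arbitrary).
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=8,
    weights=[0.7, 0.3], random_state=0,
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    # Step 1: LASSO-style screening. An L1-penalized logistic regression is
    # assumed here because the outcome is binary; C=0.5 is illustrative.
    ("lasso", SelectFromModel(
        LogisticRegression(penalty="l1", solver="liblinear", C=0.5),
        threshold=1e-5,
    )),
    # Step 2: recursive feature elimination down to a hypothetical 5 features.
    ("rfe", RFE(
        RandomForestClassifier(n_estimators=50, random_state=0),
        n_features_to_select=5,
    )),
    # Final classifier: random forest, the best performer named in the Results.
    ("model", RandomForestClassifier(n_estimators=200, random_state=0)),
])

# Five-fold cross-validation; feature selection is refit inside each fold,
# so the held-out fold never influences which features are kept.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
proba = cross_val_predict(pipeline, X, y, cv=cv, method="predict_proba")[:, 1]

auc = roc_auc_score(y, proba)        # discrimination
brier = brier_score_loss(y, proba)   # mean squared error of the probabilities
print(f"AUC: {auc:.3f}  Brier score: {brier:.3f}")
```

Placing both selection steps inside the pipeline is a design choice of this sketch. The abstract does not state whether feature selection was performed within or outside the cross-validation folds.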