[Development of mortality prediction model for critically ill patients based on multidimensional and dynamic clinical characteristics].

Author information

Zhao Shangping, Tang Guanxiu, Liu Pan, Guo Yanming, Yang Mingshi, Li Guohui

Affiliations

Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410003, Hunan, China.

Department of Intensive Care Unit, the Third Xiangya Hospital, Central South University, Changsha 410013, Hunan, China.

Publication information

Zhonghua Wei Zhong Bing Ji Jiu Yi Xue. 2023 Apr;35(4):415-420. doi: 10.3760/cma.j.cn121430-20220607-00550.

Abstract

OBJECTIVE

To develop a mortality prediction model for critically ill patients with the random forest algorithm, based on multidimensional and dynamic clinical data collected by the hospital information system (HIS), and to compare the predictive performance of the model with that of the acute physiology and chronic health evaluation II (APACHE II) model.

METHODS

Clinical data of 10 925 critically ill patients aged over 14 years who were admitted to the Third Xiangya Hospital of Central South University from January 2014 to June 2020 were extracted from the HIS, together with their APACHE II scores. Expected mortality was calculated with the death risk formula of the APACHE II scoring system. The 689 samples with APACHE II score records were used as the test set, and the other 10 236 samples were used to build the random forest model, of which 10% (n = 1 024) were randomly selected as the validation set and 90% (n = 9 212) as the training set. Clinical characteristics from the time series of the 3 days before the end of critical illness, including general information, vital signs, biochemical test results and intravenous drug doses, were selected to develop a random forest model for predicting the mortality of critically ill patients. Using the APACHE II model as a reference, the receiver operating characteristic (ROC) curve was drawn, and the discrimination of the model was evaluated by the area under the ROC curve (AUROC). The precision-recall (PR) curve was drawn from precision and recall, and the calibration performance of the model was evaluated by the area under the PR curve (AUPRC). A calibration curve was drawn, and the agreement between the predicted and the actual event probabilities was evaluated by the Brier score, a calibration index.
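The evaluation steps described above can be illustrated with a minimal sketch that uses only the Python standard library. It assumes the commonly cited APACHE II logistic equation (logit = -3.517 + 0.146 × score, plus 0.603 after emergency surgery, plus a diagnostic category weight); the abstract does not state which coefficients or category weights were applied. It also assumes that AUPRC is computed as step-wise average precision. All function names, the toy labels and the toy probabilities are hypothetical and are not the study's data or implementation.

```python
import math

def apache2_death_risk(score, emergency_surgery=False, category_weight=0.0):
    """Predicted death risk from the commonly cited APACHE II logistic equation.

    category_weight is the diagnostic-category coefficient; the default 0.0
    is a placeholder assumption, not a value taken from the abstract.
    """
    logit = -3.517 + 0.146 * score + category_weight
    if emergency_surgery:
        logit += 0.603
    return 1.0 / (1.0 + math.exp(-logit))

def auroc(y_true, y_prob):
    """AUROC as the share of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, y_prob):
    """Step-wise area under the PR curve; assumes no tied probabilities."""
    ranked = sorted(zip(y_prob, y_true), reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at each true death
    return precision_sum / hits

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Toy example (hypothetical): 1 = died in hospital, 0 = survived.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.35, 0.3, 0.05]

print("AUROC:", round(auroc(y_true, y_prob), 3))
print("AUPRC:", round(average_precision(y_true, y_prob), 3))
print("Brier:", round(brier_score(y_true, y_prob), 3))
print("APACHE II risk at score 20:", round(apache2_death_risk(20), 3))
```

Higher AUROC and AUPRC indicate better ranking of deaths above survivors, while a lower Brier score indicates predicted probabilities closer to the observed outcomes, which is the direction of comparison used in the abstract.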

RESULTS

Among the 10 925 patients, 7 797 were male (71.4%) and 3 128 were female (28.6%). The mean age was (58.9±16.3) years. The median length of hospital stay was 12 (7, 20) days. Most patients (n = 8 538, 78.2%) were admitted to the intensive care unit (ICU), and the median length of ICU stay was 66 (13, 151) hours. In-hospital mortality was 19.0% (2 077/10 925). Compared with the survival group (n = 8 848), patients in the death group (n = 2 077) were older (years: 60.1±16.5 vs. 58.5±16.4, P < 0.01), had a higher rate of ICU admission [82.8% (1 719/2 077) vs. 77.1% (6 819/8 848), P < 0.01], and had higher proportions of hypertension, diabetes and stroke history [44.7% (928/2 077) vs. 36.3% (3 212/8 848), 20.0% (415/2 077) vs. 16.9% (1 495/8 848), 15.5% (322/2 077) vs. 10.0% (885/8 848), all P < 0.01]. In the test set, the random forest model predicted the risk of in-hospital death of critically ill patients better than the APACHE II model: its AUROC and AUPRC were higher [AUROC: 0.856 (95% confidence interval 0.812-0.896) vs. 0.783 (95% confidence interval 0.737-0.826); AUPRC: 0.650 (95% confidence interval 0.604-0.762) vs. 0.524 (95% confidence interval 0.439-0.609)], and its Brier score was lower [0.104 (95% confidence interval 0.085-0.113) vs. 0.124 (95% confidence interval 0.107-0.141)].
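The counts and percentages reported above, and the sample split given in the Methods, can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the figures transcribed from the abstract; the variable and function names are illustrative.

```python
# Counts transcribed from the abstract; names are illustrative.
total = 10925
test_set = 689
modeling = total - test_set          # samples used to build the model
validation = round(0.10 * modeling)  # 10% validation set
training = modeling - validation     # remaining 90% training set

def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100.0 * numerator / denominator, 1)

# (label, numerator, denominator, percentage reported in the abstract)
reported = [
    ("male", 7797, 10925, 71.4),
    ("female", 3128, 10925, 28.6),
    ("ICU admission, all", 8538, 10925, 78.2),
    ("in-hospital mortality", 2077, 10925, 19.0),
    ("ICU admission, death group", 1719, 2077, 82.8),
    ("ICU admission, survival group", 6819, 8848, 77.1),
    ("hypertension, death group", 928, 2077, 44.7),
    ("hypertension, survival group", 3212, 8848, 36.3),
    ("diabetes, death group", 415, 2077, 20.0),
    ("diabetes, survival group", 1495, 8848, 16.9),
    ("stroke history, death group", 322, 2077, 15.5),
    ("stroke history, survival group", 885, 8848, 10.0),
]

# Collect any entry whose recomputed percentage differs from the reported one.
mismatches = [
    (label, pct(n, d), r) for label, n, d, r in reported if abs(pct(n, d) - r) > 1e-9
]

print("modeling / validation / training:", modeling, validation, training)
print("mismatches:", mismatches)
```

With these inputs the split reproduces the 10 236, 1 024 and 9 212 samples stated in the Methods, and every recomputed percentage agrees with the reported value.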

CONCLUSIONS

The random forest model based on multidimensional dynamic characteristics has considerable application value in predicting the in-hospital mortality risk of critically ill patients, and it outperforms the traditional APACHE II scoring system.
