Park Chanmin, Han Changho, Jang Su Kyeong, Kim Hyungjun, Kim Sora, Kang Byung Hee, Jung Kyoungwon, Yoon Dukyong
Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea.
BUD.on Inc., Seoul, Republic of Korea.
J Med Internet Res. 2025 Apr 2;27:e59520. doi: 10.2196/59520.
Delirium in intensive care unit (ICU) patients poses a significant challenge, affecting patient outcomes and health care efficiency. Developing an accurate, real-time prediction model for delirium represents an advancement in critical care, addressing needs for timely intervention and resource optimization in ICUs.
We aimed to create a novel machine learning model for delirium prediction in ICU patients using only continuous physiological data.
We developed models integrating routinely available clinical data, such as age, sex, and patient monitoring device outputs, to ensure practicality and adaptability in diverse clinical settings. To confirm the reliability of delirium determination records, we prospectively collected results of Confusion Assessment Method for the ICU (CAM-ICU) evaluations performed by qualified investigators from May 17, 2021, to December 23, 2022, and calculated Cohen κ coefficients against the routinely recorded results. Participants were included in the study if they were aged ≥18 years at ICU admission, had delirium evaluations using the CAM-ICU, and had data collected for at least 4 hours before delirium diagnosis or nondiagnosis. The development cohort from Yongin Severance Hospital (March 1, 2020, to January 12, 2022) comprised 5478 records: 5129 (93.63%) records from 651 patients for training and 349 (6.37%) records from 163 patients for internal validation. For temporal validation, we used 4438 records from the same hospital (January 28, 2022, to December 31, 2022) to reflect potential seasonal variations. External validation was performed using data from 670 patients at Ajou University Hospital (March 2022 to September 2022). We evaluated 3 machine learning algorithms (random forest [RF], extra-trees classifier, and light gradient boosting machine) and selected the RF model as the final model based on its performance. To confirm clinical utility, we performed a decision curve analysis and examined the temporal pattern of model predictions during the ICU stay.
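As an illustration of the label reliability check described above, the following minimal sketch computes Cohen κ for two raters from its standard definition (observed agreement corrected for chance agreement). The function name `cohen_kappa` and the two short label lists are hypothetical and invented for the example; they are not the study's code or data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters need the same non-zero number of labels")
    n = len(rater_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical CAM-ICU results (1 = delirium positive, 0 = negative).
nurse_labels = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
researcher_labels = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

kappa = cohen_kappa(nurse_labels, researcher_labels)
print(f"Cohen kappa: {kappa:.2f}")
```

In this toy example the raters disagree on 1 of 10 assessments, so the observed agreement is 0.90, the chance agreement is 0.54, and κ is (0.90 − 0.54)/(1 − 0.54).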
The κ coefficient between labels generated by ICU nurses and those prospectively verified by qualified researchers was 0.81, indicating reliable CAM-ICU results. Our final model performed well in internal validation (area under the receiver operating characteristic curve [AUROC]: 0.82; area under the precision-recall curve [AUPRC]: 0.62) and remained predictive in temporal validation (AUROC: 0.73; AUPRC: 0.85), although with a lower AUROC than in internal validation. External validation supported its effectiveness (AUROC: 0.84; AUPRC: 0.77). Decision curve analysis showed a positive net benefit at all thresholds, and the temporal pattern analysis showed a gradual increase in model scores as the actual delirium diagnosis time approached.
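For readers unfamiliar with decision curve analysis, the net benefit at a threshold probability p_t is conventionally defined as TP/n − (FP/n) × p_t/(1 − p_t), and it is compared with the "treat all" and "treat none" (net benefit 0) strategies. The sketch below applies that standard formula to a small invented set of labels and scores; the function names and data are hypothetical and do not reproduce the study's analysis.

```python
def net_benefit(y_true, y_score, threshold):
    """Net benefit of flagging patients whose score is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    true_pos = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    false_pos = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of flagging every patient, regardless of score."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = delirium) and model scores for 10 records.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_score = [0.90, 0.80, 0.60, 0.70, 0.30, 0.20, 0.10, 0.40, 0.35, 0.05]

for p_t in (0.2, 0.5, 0.8):
    model_nb = net_benefit(y_true, y_score, p_t)
    all_nb = net_benefit_treat_all(y_true, p_t)
    print(f"p_t={p_t:.1f}  model={model_nb:+.3f}  treat-all={all_nb:+.3f}  treat-none=+0.000")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both reference strategies at that threshold.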
We developed a machine learning model for delirium prediction in ICU patients using routinely measured variables, including physiological waveforms. Our study demonstrates the potential of the RF model in predicting delirium, with consistent performance across various validation scenarios. The model uses noninvasive variables, making it applicable to a wide range of ICU patients, with minimal additional risk.