Zhang Hongwei, Diao Mengyuan, Zhang Sheng, Ni Peifeng, Zhang Weidong, Wu Chenxi, Zhu Ying, Hu Wei
Department of Critical Care Medicine, Affiliated Hangzhou First People's Hospital, School of Medicine, Westlake University, 261 Huansha Road, Hangzhou, 310006, China, 86 13634164536.
Department of Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University, School of Medicine, Shanghai, China.
J Med Internet Res. 2025 Jul 3;27:e63847. doi: 10.2196/63847.
Traumatic brain injury (TBI) is a critical condition with a high mortality rate, and clinical care continually seeks to optimize treatment strategies to improve survival.
This study aims to develop a reinforcement learning (RL) algorithm to optimize treatment decision-making for the survival of patients with TBI in the intensive care unit.
We included 2745 patients from the Medical Information Mart for Intensive Care (MIMIC)-IV database and randomly divided them into a training set and an internal validation set in an 8:2 ratio. We extracted 34 features for analysis and modeling using a 2-hour time compensation, 2 action features (mean arterial pressure and temperature), and 1 outcome feature (survival status at 28 days). We used an RL algorithm called weighted dueling double deep Q-network with embedded human expertise to maximize cumulative returns and evaluated the model using a doubly robust off-policy evaluation method. Finally, we used 2463 patients with TBI from MIMIC-III as an external validation set to test the model.
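As an illustration of the evaluation step, the following is a minimal sketch of a per-trajectory doubly robust off-policy estimate in its standard recursive form. It is not the study's implementation: the function name, the argument layout, and the discount factor are assumptions made for this example, and the inputs (importance ratios and model-based value estimates) are presumed to come from separately fitted models.

```python
import numpy as np

def doubly_robust_value(rewards, rho, q_hat, v_hat, gamma=0.99):
    """Recursive doubly robust value estimate for a single trajectory.

    All names here are hypothetical, chosen for illustration only.
    rewards[t]: observed reward at step t
    rho[t]:     importance ratio pi_eval(a_t | s_t) / pi_behavior(a_t | s_t)
    q_hat[t]:   model estimate of Q(s_t, a_t) under the evaluated policy
    v_hat[t]:   model estimate of V(s_t) under the evaluated policy
    gamma:      discount factor (assumed value, not taken from the study)
    """
    rewards, rho, q_hat, v_hat = (
        np.asarray(x, dtype=float) for x in (rewards, rho, q_hat, v_hat)
    )
    v_dr = 0.0
    # Walk backwards from the final step, correcting the model estimate
    # with the importance-weighted temporal-difference residual.
    for t in reversed(range(len(rewards))):
        v_dr = v_hat[t] + rho[t] * (rewards[t] + gamma * v_dr - q_hat[t])
    return float(v_dr)
```

When the evaluated and behavior policies agree (all ratios equal 1) and the two model estimates coincide, the estimate reduces to the observed discounted return. When all ratios are 0, it falls back to the model estimate at the first step.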
The action features were divided into 6 intervals, and the expected returns were estimated using a doubly robust off-policy evaluation method. The estimated survival rate under the artificial intelligence (AI) policy was higher than that under clinicians' decisions (88.016%, 95% CI 85.191%-90.840% vs 81.094%, 95% CI 80.422%-81.765%), with an expected return of 28.978 (95% CI 28.797-29.160) versus 27.092 (95% CI 24.584-29.600). Compared with clinicians, the AI algorithm selected normal temperatures (36.56 °C to 36.83 °C) more frequently and recommended mean arterial pressure levels of 87.5-95.0 mm Hg. In external validation, the AI policy maintained a high survival rate of 87.565%, with an expected return of 27.517.
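To make the discretization of the action features concrete, the sketch below bins mean arterial pressure and temperature into 6 intervals each and combines the two bin indices into one discrete action. Several points are assumptions for illustration and are not stated in the abstract: that each feature gets its own 6 intervals, that interval edges are quantile-based, and that the two features form a joint 6 x 6 action space. The helper names are hypothetical.

```python
import numpy as np

N_BINS = 6  # number of intervals per action feature, as reported

def make_edges(values, n_bins=N_BINS):
    """Interior cut points that split `values` into n_bins quantile bins.

    Quantile-based edges are an assumption; the study's edges may differ.
    """
    interior_quantiles = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]
    return np.quantile(np.asarray(values, dtype=float), interior_quantiles)

def joint_action(map_mmhg, temp_c, map_edges, temp_edges, n_bins=N_BINS):
    """Map a (mean arterial pressure, temperature) pair to one action index.

    Returns an integer in [0, n_bins * n_bins), assuming a joint action space.
    """
    map_bin = int(np.digitize(map_mmhg, map_edges))   # 0 .. n_bins - 1
    temp_bin = int(np.digitize(temp_c, temp_edges))   # 0 .. n_bins - 1
    return map_bin * n_bins + temp_bin
```

In use, the edges would be computed once on the training set and then reused unchanged on the internal and external validation sets, so that an action index means the same interval everywhere.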
This RL algorithm for patients with TBI indicates that a more personalized and targeted optimization of vital signs is possible. The algorithm will assist clinicians in making decisions on a patient-by-patient basis.