Choi Dong Hyun, Lim Min Hyuk, Hong Ki Jeong, Kim Young Gyun, Park Jeong Ho, Song Kyoung Jun, Do Shin Sang, Kim Sungwan
Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, South Korea.
Graduate School of Health Science and Technology, Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.
NPJ Digit Med. 2024 Oct 9;7(1):276. doi: 10.1038/s41746-024-01278-3.
On-scene resuscitation time is associated with out-of-hospital cardiac arrest (OHCA) outcomes. We developed and validated reinforcement learning models for individualized on-scene resuscitation times, leveraging nationwide Korean data. Adult OHCA patients with a medical cause of arrest were included (N = 73,905). The optimal policy was derived from conservative Q-learning to maximize survival. The on-scene return of spontaneous circulation hazard rates estimated from the Random Survival Forest were used as intermediate rewards to handle sparse rewards, while patients' historical survival was reflected in the terminal rewards. The optimal policy increased the survival to hospital discharge rate from 9.6% to 12.5% (95% CI: 12.2-12.8) and the good neurological recovery rate from 5.4% to 7.5% (95% CI: 7.3-7.7). The recommended maximum on-scene resuscitation times for patients demonstrated a bimodal distribution, varying with patient, emergency medical services, and OHCA characteristics. Our survival analysis-based approach generates explainable rewards, reducing subjectivity in reinforcement learning.
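The reward design and conservative Q-learning step described above can be illustrated with a toy tabular sketch. This is not the authors' implementation: the state here is only the elapsed on-scene minute, the two actions (0 = continue on-scene resuscitation, 1 = stop and transport) are a simplification, and the names `build_transitions`, `cql_tabular`, the hazard values, and all hyperparameters are hypothetical. It assumes intermediate rewards equal the estimated on-scene ROSC hazard at each minute and the terminal reward equals the recorded survival outcome.

```python
import numpy as np

def build_transitions(hazards, stop_minute, survived):
    """Turn one historical episode into (s, a, r, s_next, done) tuples.

    hazards: hypothetical per-minute on-scene ROSC hazard estimates
             (in the paper these come from a Random Survival Forest).
    stop_minute: minute at which on-scene resuscitation ended.
    survived: 1 if the patient survived to discharge, else 0.
    """
    transitions = []
    for t in range(stop_minute):
        # Action 0 (continue): intermediate reward is the hazard estimate.
        transitions.append((t, 0, float(hazards[t]), t + 1, False))
    # Action 1 (stop): terminal reward reflects the recorded outcome.
    transitions.append((stop_minute, 1, float(survived), stop_minute, True))
    return transitions

def cql_tabular(transitions, n_states, n_actions,
                alpha=1.0, gamma=0.99, lr=0.1, epochs=200):
    """Tabular Q-learning with a conservative (CQL-style) penalty.

    Per-sample loss: 0.5 * (Q(s,a) - target)^2
                     + alpha * (logsumexp_a' Q(s,a') - Q(s,a)).
    The penalty lowers Q-values of actions not seen in the data.
    """
    q = np.zeros((n_states, n_actions))
    for _ in range(epochs):
        for s, a, r, s_next, done in transitions:
            target = r if done else r + gamma * q[s_next].max()
            td_grad = q[s, a] - target
            # Gradient of logsumexp is the softmax over actions.
            z = q[s] - q[s].max()
            softmax = np.exp(z) / np.exp(z).sum()
            penalty_grad = alpha * softmax
            penalty_grad[a] -= alpha
            q[s] -= lr * penalty_grad
            q[s, a] -= lr * td_grad
    return q

# Toy episode: resuscitation continued for 2 minutes, then stopped; patient survived.
episode = build_transitions([0.05, 0.03, 0.01], stop_minute=2, survived=1)
q_values = cql_tabular(episode, n_states=3, n_actions=2)
greedy_policy = q_values.argmax(axis=1)  # recommended action per minute
```

With a single episode the greedy policy simply reproduces the observed actions, since the penalty pushes down every action absent from the data; with many episodes the trade-off between the data-supported actions is driven by the shaped rewards.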