Zhang Tianyi, Qu Yimeng, Wang Deyong, Zhong Ming, Cheng Yunzhang, Zhang Mingwei
School of Health Sciences and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093 China.
Shanghai Interventional Medical Device Engineering Technology Research Center, Shanghai, 200093 China.
Biomed Eng Lett. 2024 Jan 4;14(2):279-289. doi: 10.1007/s13534-023-00343-2. eCollection 2024 Mar.
Existing sepsis treatment lacks an effective reference standard and relies heavily on the experience of clinicians. We therefore used reinforcement learning to build a decision-support model for sepsis medication.
Using the latest Sepsis 3.0 diagnostic criteria, 19,582 sepsis patients were selected from the Medical Information Mart for Intensive Care III (MIMIC-III) database for treatment strategy research, and forty-six features were used in modeling. The medication strategy under study is the dosage of vasopressor drugs and intravenous infusion. A dueling double deep Q-network (Dueling DDQN) is proposed to predict the patient's medication strategy (vasopressor and intravenous infusion dosage) from the relationship between the patient's state, the reward function, and the medication action. We also built safeguards against potentially high-risk behaviors of Dueling DDQN, in particular sudden changes in vasopressor dose, which can lead to harmful clinical effects. To strengthen the guidance that clinically effective medication strategies provide to the model, we proposed a hybrid model (Safe-Dueling DDQN + expert strategies) to optimize medication strategies.
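The two ideas behind the name "Dueling DDQN" can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the mean-subtracted aggregation, and the discount factor `gamma=0.99` are assumptions taken from the standard dueling and double DQN formulations, since the abstract does not give these details.

```python
import numpy as np

def dueling_q(value, advantage):
    """Combine a state value V(s) and advantages A(s, a) into Q(s, a).

    value has shape (batch, 1); advantage has shape (batch, n_actions).
    """
    # Subtracting the mean advantage keeps V and A identifiable
    # (standard dueling aggregation, assumed here).
    return value + advantage - advantage.mean(axis=-1, keepdims=True)

def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: the online network picks the next action,
    the target network scores it."""
    best = np.argmax(q_next_online, axis=-1)
    next_q = np.take_along_axis(q_next_target, best[..., None], axis=-1).squeeze(-1)
    # Terminal transitions (done == 1) contribute only the reward.
    return reward + gamma * (1.0 - done) * next_q
```

In a setup like this, `dueling_q` would be the last step of the network's forward pass and `double_dqn_target` would supply the regression target for the temporal-difference loss.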
The Dueling DDQN medication model for sepsis patients outperforms clinical strategies and other models in terms of off-policy evaluation values and mortality, reducing mortality from 16.8% under clinical strategies to 13.8%. Compared with Dueling DDQN, the proposed Safe-Dueling DDQN selects fewer actions involving vasopressors overall and reduces large dose fluctuations. The proposed hybrid model can switch between expert strategies and Safe-Dueling DDQN strategies based on the current state of the patient.
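One way to picture the safety constraint and the hybrid switch is the sketch below. Everything specific in it is hypothetical: the abstract does not state the action encoding, the size of the dose bins, the allowed dose jump, or the rule for handing control to the expert strategy. The sketch assumes a 5 x 5 grid of discretized vasopressor and intravenous dose bins, a maximum change of one vasopressor bin per step, and a caller-supplied `use_expert` flag.

```python
import numpy as np

N_VASO = 5  # assumed number of vasopressor dose bins
N_IV = 5    # assumed number of intravenous infusion dose bins

def safe_greedy_action(q_values, prev_vaso_bin, max_jump=1):
    """Pick the highest-valued action whose vasopressor bin is within
    max_jump bins of the previous step's vasopressor bin."""
    # Assumed encoding: action = vaso_bin * N_IV + iv_bin.
    vaso_bins = np.arange(N_VASO * N_IV) // N_IV
    allowed = np.abs(vaso_bins - prev_vaso_bin) <= max_jump
    # Disallowed actions are masked out so argmax cannot select them.
    masked = np.where(allowed, q_values, -np.inf)
    return int(np.argmax(masked))

def hybrid_action(q_values, prev_vaso_bin, expert_action, use_expert):
    """Return the expert action when use_expert is true, otherwise the
    safety-constrained greedy action. The switching rule is left to the caller."""
    if use_expert:
        return int(expert_action)
    return safe_greedy_action(q_values, prev_vaso_bin)
```

Masking at action-selection time is only one possible design; a penalty on large dose changes in the reward function would be another way to discourage the same behavior.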
The reinforcement learning model we proposed for sepsis medication treatment has practical clinical value and can improve patient survival to a certain extent while keeping medication balanced and safe.