Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units.

Affiliations

School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, 510015, China.

School of Computer Science and Technology, Dalian University of Technology, Dalian, 110621, China.

Publication information

BMC Med Inform Decis Mak. 2020 Jul 9;20(Suppl 3):124. doi: 10.1186/s12911-020-1120-5.

Abstract

BACKGROUND

Reinforcement learning (RL) is a promising technique for solving complex sequential decision-making problems in healthcare. Recent years have seen great progress in applying RL to decision-making problems in Intensive Care Units (ICUs). However, because traditional RL algorithms aim to maximize a long-term reward function, exploration during learning may have a fatal impact on the patient. A short-term goal should therefore also be considered, to keep the patient stable during treatment.

METHODS

We use a Supervised-Actor-Critic (SAC) RL algorithm to address this problem by combining the long-term, goal-oriented character of RL with the short-term goal of supervised learning. We evaluate the differences between SAC and the traditional Actor-Critic (AC) algorithm on the decision-making problems of ventilation and sedative dosing in ICUs.
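The abstract does not give the training objective, so the following is only a minimal sketch of one plausible way to combine the two goals, not the authors' implementation. It assumes a discrete action space, a softmax policy, a one-step TD error as the advantage, and a fixed mixing weight. The names `sac_actor_loss`, `td_error`, and `sup_weight` are hypothetical and chosen for illustration.

```python
import math


def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def td_error(reward, value_s, value_next, gamma=0.99):
    """One-step TD error from the critic, used here as the advantage."""
    return reward + gamma * value_next - value_s


def sac_actor_loss(logits, action_taken, clinician_action, advantage,
                   sup_weight=0.5):
    """Weighted sum of an RL term and a supervised term (an assumption).

    rl_loss  : policy-gradient surrogate, the long-term goal
    sup_loss : cross-entropy against the clinician's recorded action,
               the short-term goal
    """
    probs = softmax(logits)
    rl_loss = -advantage * math.log(probs[action_taken])
    sup_loss = -math.log(probs[clinician_action])
    return (1.0 - sup_weight) * rl_loss + sup_weight * sup_loss


# Illustrative numbers only; they are not taken from the paper.
delta = td_error(reward=1.0, value_s=0.4, value_next=0.6)
loss = sac_actor_loss([0.2, 1.0, -0.5], action_taken=1,
                      clinician_action=1, advantage=delta)
```

Under these assumptions, `sup_weight=0.0` recovers a plain actor-critic update, while `sup_weight=1.0` reduces to imitating the clinician's action.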

RESULTS

Results show that SAC is much more efficient than the traditional AC algorithm in terms of convergence rate and data utilization.

CONCLUSIONS

The SAC algorithm not only aims to cure patients in the long term, but also reduces deviation from the strategy applied by clinicians, and thus improves the therapeutic effect.
