
Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units.

Affiliations

School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, 510015, China.

School of Computer Science and Technology, Dalian University of Technology, Dalian, 110621, China.

Publication information

BMC Med Inform Decis Mak. 2020 Jul 9;20(Suppl 3):124. doi: 10.1186/s12911-020-1120-5.

DOI: 10.1186/s12911-020-1120-5
PMID: 32646412
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7344039/

Abstract

BACKGROUND

Reinforcement learning (RL) provides a promising technique for solving complex sequential decision-making problems in healthcare domains. Recent years have seen great progress in applying RL to decision-making problems in Intensive Care Units (ICUs). However, since the goal of traditional RL algorithms is to maximize a long-term reward function, exploration during learning may have a fatal impact on the patient. As such, a short-term goal should also be considered to keep the patient stable during treatment.

METHODS

We use a Supervised-Actor-Critic (SAC) RL algorithm to address this problem by combining the long-term, goal-oriented character of RL with the short-term goal of supervised learning. We evaluate the differences between SAC and the traditional Actor-Critic (AC) algorithm on the decision-making problems of ventilation and sedative dosing in ICUs.
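
The abstract does not state the update rule, so the following is only a minimal sketch of how a long-term RL signal and a short-term supervised signal might be blended in one actor update. Everything in it is an assumption made for illustration: the class name `SupervisedActorCriticSketch`, the linear actor and critic, the unit-variance Gaussian policy, the blend weight `eps`, and the step sizes. It is not the authors' implementation.

```python
from typing import List, Sequence


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


class SupervisedActorCriticSketch:
    """Linear actor and critic with a blended RL + supervised actor update.

    Hypothetical illustration only; not the method as published.
    """

    def __init__(self, n_features: int, gamma: float = 0.99,
                 lr_actor: float = 0.1, lr_critic: float = 0.1,
                 eps: float = 0.5) -> None:
        self.gamma = gamma          # discount factor (assumed value)
        self.lr_actor = lr_actor    # actor step size (assumed value)
        self.lr_critic = lr_critic  # critic step size (assumed value)
        self.eps = eps              # weight of the supervised term, in [0, 1]
        self.actor_w: List[float] = [0.0] * n_features
        self.critic_w: List[float] = [0.0] * n_features

    def act(self, state: Sequence[float]) -> float:
        """Policy mean, e.g. a recommended dose for this state."""
        return dot(self.actor_w, state)

    def value(self, state: Sequence[float]) -> float:
        """Critic's estimate of the long-term return from this state."""
        return dot(self.critic_w, state)

    def update(self, state: Sequence[float], action: float, reward: float,
               next_state: Sequence[float], clinician_action: float,
               done: bool = False) -> float:
        """Apply one transition and return the TD error."""
        # Long-term signal: one-step temporal-difference error from the critic.
        bootstrap = 0.0 if done else self.gamma * self.value(next_state)
        td_error = reward + bootstrap - self.value(state)

        mean = self.act(state)
        # RL term: policy-gradient direction for a unit-variance Gaussian policy.
        rl_term = td_error * (action - mean)
        # Supervised term: descent direction of 0.5 * (mean - clinician_action)**2.
        sl_term = clinician_action - mean
        step = (1.0 - self.eps) * rl_term + self.eps * sl_term

        for i, x in enumerate(state):
            self.critic_w[i] += self.lr_critic * td_error * x
            self.actor_w[i] += self.lr_actor * step * x
        return td_error
```

With `eps=1.0` the actor simply imitates the clinician's recorded action, and with `eps=0.0` it reduces to a plain actor-critic update; intermediate values trade off the two goals.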

RESULTS

Results show that SAC is much more efficient than the traditional AC algorithm in terms of convergence rate and data utilization.

CONCLUSIONS

The SAC algorithm not only aims to cure patients in the long term, but also reduces the degree of deviation from the strategy applied by clinical doctors and thus improves the therapeutic effect.

