Continuous action deep reinforcement learning for propofol dosing during general anesthesia.

Authors

Schamberg Gabriel, Badgeley Marcus, Meschede-Krasa Benyamin, Kwon Ohyoon, Brown Emery N

Affiliations

Picower Institute for Learning and Memory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Tempus, Chicago, IL 60654, USA.

Publication information

Artif Intell Med. 2022 Jan;123:102227. doi: 10.1016/j.artmed.2021.102227. Epub 2021 Dec 2.

DOI: 10.1016/j.artmed.2021.102227

PMID: 34998516

Abstract

PURPOSE

Anesthesiologists simultaneously manage several aspects of patient care during general anesthesia. Automating administration of hypnotic agents could enable more precise control of a patient's level of unconsciousness and allow anesthesiologists to focus on the most critical aspects of patient care. Reinforcement learning (RL) algorithms can be used to fit a mapping from patient state to a medication regimen. These algorithms can learn complex control policies that, when paired with modern techniques for promoting model interpretability, offer a promising approach for developing a clinically viable system for automated anesthetic drug delivery.

METHODS

We expand on our prior work applying deep RL to automated anesthetic dosing by using a continuous-action model based on the actor-critic RL paradigm. The proposed RL agent is composed of a policy network that maps observed anesthetic states to a continuous probability density over propofol infusion rates and a value network that estimates the favorability of observed states. We train and test three versions of the RL agent using varied reward functions. The agent is trained using simulated pharmacokinetic/pharmacodynamic models with randomized parameters to ensure robustness to patient variability. The model is tested on simulations and retrospectively on nine general anesthesia cases collected in the operating room. We use Shapley additive explanations to understand which factors have the greatest influence over the agent's decision-making.
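The actor-critic structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture, so a linear map stands in for each network, and the names `LinearGaussianPolicy`, `linear_value`, and `td_advantage`, the state features, and all parameter values are hypothetical. The sketch only shows how a policy can output a continuous density (here a normal distribution) over infusion rates while a separate value estimate scores states.

```python
import math
import random


def gaussian_log_prob(x, mean, std):
    """Log density of a univariate normal distribution evaluated at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


class LinearGaussianPolicy:
    """Toy stand-in for a policy network (assumption: linear mean, fixed std).

    Maps a state vector to Normal(mean, std) over the infusion rate.
    """

    def __init__(self, weights, bias, log_std):
        self.weights = weights
        self.bias = bias
        self.log_std = log_std

    def distribution(self, state):
        mean = sum(w * s for w, s in zip(self.weights, state)) + self.bias
        return mean, math.exp(self.log_std)

    def sample(self, state, rng):
        mean, std = self.distribution(state)
        raw = rng.gauss(mean, std)
        # Log-probability of the unclipped draw, as used in policy-gradient updates.
        log_prob = gaussian_log_prob(raw, mean, std)
        # An infusion rate cannot be negative, so the applied action is clipped.
        rate = max(0.0, raw)
        return rate, log_prob


def linear_value(state, weights, bias):
    """Toy stand-in for a value network: a scalar favorability score for a state."""
    return sum(w * s for w, s in zip(weights, state)) + bias


def td_advantage(reward, value_now, value_next, gamma=0.99):
    """One-step temporal-difference advantage: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now


# Hypothetical state: [current effect level, target effect level].
rng = random.Random(0)
policy = LinearGaussianPolicy(weights=[-1.0, 1.0], bias=0.1, log_std=-1.0)
rate, log_prob = policy.sample([0.3, 0.5], rng)
```

In an actor-critic update, the advantage computed from the value estimates weights the gradient of `log_prob`, so actions that led to better-than-expected states become more likely.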

RESULTS

The deep RL agent significantly outperformed a proportional-integral-derivative controller (median episode median absolute performance error 1.9% ± 1.8 and 3.1% ± 1.1, respectively). The model that was rewarded for minimizing total doses performed the best across simulated patient demographics (median episode median performance error 1.1% ± 0.5). When run on real-world clinical datasets, the agent recommended doses that were consistent with those administered by the anesthesiologist.
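The error metrics quoted above can be computed as in the following sketch. It assumes the conventional definition of performance error as the percentage deviation of the measured value from the target; the abstract does not state the formula, and the function names and example numbers are illustrative only.

```python
from statistics import median


def performance_errors(measured, target):
    """Percentage deviation of each measurement from the target (assumed definition)."""
    return [100.0 * (m - target) / target for m in measured]


def mdpe(measured, target):
    """Median performance error: signed, so it reflects bias above or below target."""
    return median(performance_errors(measured, target))


def mdape(measured, target):
    """Median absolute performance error: unsigned, so it reflects overall inaccuracy."""
    return median(abs(pe) for pe in performance_errors(measured, target))


def median_over_episodes(episodes, target, metric=mdape):
    """Median across episodes of a per-episode metric."""
    return median(metric(episode, target) for episode in episodes)


# Illustrative numbers only, not data from the study.
episode = [50.0, 55.0, 45.0, 60.0]
print(mdape(episode, 50.0))  # 10.0
print(mdpe(episode, 50.0))   # 5.0
```

Applying `median_over_episodes` to a list of simulated episodes yields a "median episode" figure of the kind reported in the results.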

CONCLUSIONS

The proposed approach marks the first fully continuous deep RL algorithm for automating anesthetic drug dosing. The reward function used by the RL training algorithm can be flexibly designed to encourage desirable practices (e.g., using less anesthetic) and to improve performance. Through careful analysis of the learned policies, techniques for interpreting dosing decisions, and testing on clinical data, we confirm that the agent's anesthetic dosing is consistent with our understanding of best practices in anesthesia care.
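One way a reward function might encode a preference for using less anesthetic is sketched below. The abstract does not give the reward functions used in the study, so the functional form, the names `tracking_reward` and `dose_penalized_reward`, and the `dose_weight` parameter are all assumptions made for illustration.

```python
def tracking_reward(effect, target):
    """Reward that only penalizes distance from the target anesthetic state."""
    return -abs(effect - target)


def dose_penalized_reward(effect, target, dose, dose_weight=0.1):
    """Tracking reward minus a penalty proportional to the dose given.

    dose_weight is a hypothetical trade-off parameter: larger values push
    the agent toward smaller doses at some cost in tracking accuracy.
    """
    return tracking_reward(effect, target) - dose_weight * dose


# With equal tracking error, the lower dose earns the higher reward.
low = dose_penalized_reward(effect=0.5, target=0.5, dose=1.0)
high = dose_penalized_reward(effect=0.5, target=0.5, dose=3.0)
```

Swapping one reward function for another changes the behavior the agent is trained toward without changing the rest of the training loop, which is the flexibility the conclusions refer to.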


