Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network.

Affiliations

Department of Electrical Engineering and Computer Engineering, Northeastern University, Boston, MA, 02115, USA.

Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, 53226, USA.

Publication information

Sci Rep. 2019 Feb 6;9(1):1495. doi: 10.1038/s41598-018-37142-0.

DOI: 10.1038/s41598-018-37142-0
PMID: 30728403
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6365640/

Abstract

This paper presents the deep reinforcement learning (DRL) framework to estimate the optimal Dynamic Treatment Regimes from observational medical data. This framework is more flexible and adaptive for high dimensional action and state spaces than existing reinforcement learning methods to model real-life complexity in heterogeneous disease progression and treatment choices, with the goal of providing doctors and patients the data-driven personalized decision recommendations. The proposed DRL framework comprises (i) a supervised learning step to predict expert actions, and (ii) a deep reinforcement learning step to estimate the long-term value function of Dynamic Treatment Regimes. Both steps depend on deep neural networks. As a key motivational example, we have implemented the proposed framework on a data set from the Center for International Bone Marrow Transplant Research (CIBMTR) registry database, focusing on the sequence of prevention and treatments for acute and chronic graft versus host disease after transplantation. In the experimental results, we have demonstrated promising accuracy in predicting human experts' decisions, as well as the high expected reward function in the DRL-based dynamic treatment regimes.
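
Step (ii) of the framework, estimating a long-term value function from logged data, can be illustrated with a minimal sketch. This is not the authors' implementation: a lookup table stands in for the deep Q-network, and the states (`s0`, `s1`), actions (`a0`, `a1`), rewards, and discount factor are hypothetical toy values chosen only to show how the Bellman backup prefers an action with a lower immediate reward but a higher long-term return. Step (i), the supervised prediction of expert actions, is not shown.

```python
from collections import defaultdict


def fitted_q(transitions, actions, gamma=0.9, sweeps=50):
    """Offline Q-learning over logged (state, action, reward, next_state) tuples.

    next_state is None when the episode ends. A lookup table replaces the
    deep Q-network described in the abstract (an illustrative simplification).
    """
    q = defaultdict(float)
    for _ in range(sweeps):
        totals = defaultdict(float)
        counts = defaultdict(int)
        for s, a, r, s_next in transitions:
            # Bellman target: immediate reward plus discounted best future value.
            future = 0.0 if s_next is None else max(q[(s_next, b)] for b in actions)
            totals[(s, a)] += r + gamma * future
            counts[(s, a)] += 1
        # Average the targets observed for each (state, action) pair.
        q = defaultdict(float, {k: totals[k] / counts[k] for k in totals})
    return q


def greedy_action(q, state, actions):
    """Return the action with the highest estimated long-term value."""
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical logged data, not taken from the CIBMTR registry.
ACTIONS = ["a0", "a1"]
logged = [
    ("s0", "a0", 0.0, "s1"),  # no immediate reward, but leads to s1
    ("s0", "a1", 0.5, None),  # small immediate reward, then the episode ends
    ("s1", "a0", 1.0, None),
    ("s1", "a1", 0.0, None),
]

q = fitted_q(logged, ACTIONS)
print(greedy_action(q, "s0", ACTIONS))  # prints "a0"
```

In this toy data, `a1` pays more immediately in `s0` (0.5 versus 0.0), yet the learned value of `a0` is 0.9 because it leads to `s1`, where a reward of 1.0 is available. A policy that only imitated per-step outcomes would miss this, which is why the value function is estimated over the whole treatment sequence.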

Figures (images hosted on PMC)

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/490a53987751/41598_2018_37142_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/12fb6d86eb74/41598_2018_37142_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/641762443b1c/41598_2018_37142_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/688bc7e31c3f/41598_2018_37142_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/b30767b8b8cb/41598_2018_37142_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/52f49b2016eb/41598_2018_37142_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/e0e27b1fa93a/41598_2018_37142_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d66/6365640/0e4440ba355f/41598_2018_37142_Fig7_HTML.jpg

Similar articles

1. Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network.
   Sci Rep. 2019 Feb 6;9(1):1495. doi: 10.1038/s41598-018-37142-0.
2. Deep Reinforcement Learning for Dynamic Treatment Regimes on Medical Registry Data.
   Healthc Inform. 2017 Aug;2017:380-385. doi: 10.1109/ICHI.2017.45.
3. Deep reinforcement learning and its applications in medical imaging and radiation therapy: a survey.
   Phys Med Biol. 2022 Nov 11;67(22). doi: 10.1088/1361-6560/ac9cb3.
4. Deep reinforcement learning in medical imaging: A literature review.
   Med Image Anal. 2021 Oct;73:102193. doi: 10.1016/j.media.2021.102193. Epub 2021 Jul 27.
5. Deep reinforcement learning for automated radiation adaptation in lung cancer.
   Med Phys. 2017 Dec;44(12):6690-6705. doi: 10.1002/mp.12625. Epub 2017 Nov 14.
6. Tools for the Precision Medicine Era: How to Develop Highly Personalized Treatment Recommendations From Cohort and Registry Data Using Q-Learning.
   Am J Epidemiol. 2017 Jul 15;186(2):160-172. doi: 10.1093/aje/kwx027.
7. Quantum deep reinforcement learning for clinical decision support in oncology: application to adaptive radiotherapy.
   Sci Rep. 2021 Dec 7;11(1):23545. doi: 10.1038/s41598-021-02910-y.
8. Deep Reinforcement Learning on Autonomous Driving Policy With Auxiliary Critic Network.
   IEEE Trans Neural Netw Learn Syst. 2023 Jul;34(7):3680-3690. doi: 10.1109/TNNLS.2021.3116063. Epub 2023 Jul 6.
9. Dynamic sparse coding-based value estimation network for deep reinforcement learning.
   Neural Netw. 2023 Nov;168:180-193. doi: 10.1016/j.neunet.2023.09.013. Epub 2023 Sep 11.
10. Evaluating the impact of reinforcement learning on automatic deep brain stimulation planning.
    Int J Comput Assist Radiol Surg. 2024 Jun;19(6):995-1002. doi: 10.1007/s11548-024-03078-2. Epub 2024 Feb 27.

Cited by

1. Reinforcement Learning and Its Clinical Applications Within Healthcare: A Systematic Review of Precision Medicine and Dynamic Treatment Regimes.
   Healthcare (Basel). 2025 Jul 19;13(14):1752. doi: 10.3390/healthcare13141752.
2. Chronic Kidney Disease-Mineral and Bone Disorder Management in 4D: The Case for Dynamic Treatment Regime Methods to Optimize Care.
   Curr Osteoporos Rep. 2025 Mar 25;23(1):16. doi: 10.1007/s11914-025-00911-8.
3. The Applications of Machine Learning in the Management of Patients Undergoing Stem Cell Transplantation: Are We Ready?
   Cancers (Basel). 2025 Jan 25;17(3):395. doi: 10.3390/cancers17030395.
4. Energy landscape analysis and time-series clustering analysis of patient state multistability related to rheumatoid arthritis drug treatment: The KURAMA cohort study.
   PLoS One. 2024 May 6;19(5):e0302308. doi: 10.1371/journal.pone.0302308. eCollection 2024.
5. Development and validation of a reinforcement learning model for ventilation control during emergence from general anesthesia.
   NPJ Digit Med. 2023 Aug 14;6(1):145. doi: 10.1038/s41746-023-00893-w.
6. A scoping review of studies using observational data to optimise dynamic treatment regimens.
   BMC Med Res Methodol. 2021 Feb 22;21(1):39. doi: 10.1186/s12874-021-01211-2.

References

1. Deep reinforcement learning for automated radiation adaptation in lung cancer.
   Med Phys. 2017 Dec;44(12):6690-6705. doi: 10.1002/mp.12625. Epub 2017 Nov 14.
2. Tools for the Precision Medicine Era: How to Develop Highly Personalized Treatment Recommendations From Cohort and Registry Data Using Q-Learning.
   Am J Epidemiol. 2017 Jul 15;186(2):160-172. doi: 10.1093/aje/kwx027.
3. Mastering the game of Go with deep neural networks and tree search.
   Nature. 2016 Jan 28;529(7587):484-9. doi: 10.1038/nature16961.
4. Microrandomized trials: An experimental design for developing just-in-time adaptive interventions.
   Health Psychol. 2015 Dec;34S(0):1220-8. doi: 10.1037/hea0000305.
5. New Statistical Learning Methods for Estimating Optimal Dynamic Treatment Regimes.
   J Am Stat Assoc. 2015;110(510):583-598. doi: 10.1080/01621459.2014.937488.
6. Human-level control through deep reinforcement learning.
   Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
7. Prophylaxis and treatment of GVHD: EBMT-ELN working group recommendations for a standardized practice.
   Bone Marrow Transplant. 2014 Feb;49(2):168-73. doi: 10.1038/bmt.2013.107. Epub 2013 Jul 29.
8. A robust method for estimating optimal treatment regimes.
   Biometrics. 2012 Dec;68(4):1010-8. doi: 10.1111/j.1541-0420.2012.01763.x. Epub 2012 May 2.
9. Reinforcement learning design for cancer clinical trials.
   Stat Med. 2009 Nov 20;28(26):3294-315. doi: 10.1002/sim.3720.
10. Demystifying optimal dynamic treatment regimes.
    Biometrics. 2007 Jun;63(2):447-55. doi: 10.1111/j.1541-0420.2006.00686.x.