Learning the Dynamic Treatment Regimes from Medical Registry Data through Deep Q-network.

Affiliations

Department of Electrical Engineering and Computer Engineering, Northeastern University, Boston, MA, 02115, USA.

Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, 53226, USA.

Publication information

Sci Rep. 2019 Feb 6;9(1):1495. doi: 10.1038/s41598-018-37142-0.

DOI:10.1038/s41598-018-37142-0

PMID:30728403

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6365640/

Abstract

This paper presents a deep reinforcement learning (DRL) framework for estimating optimal dynamic treatment regimes from observational medical data. Compared with existing reinforcement learning methods, the framework is more flexible and adaptive in high-dimensional action and state spaces, which lets it model the real-life complexity of heterogeneous disease progression and treatment choices. The goal is to give doctors and patients data-driven, personalized decision recommendations. The proposed DRL framework comprises (i) a supervised learning step that predicts expert actions and (ii) a deep reinforcement learning step that estimates the long-term value function of dynamic treatment regimes. Both steps rely on deep neural networks. As a motivating example, we implemented the framework on a data set from the Center for International Blood and Marrow Transplant Research (CIBMTR) registry database, focusing on the sequence of prevention and treatment decisions for acute and chronic graft-versus-host disease after transplantation. In the experiments, the framework showed promising accuracy in predicting human experts' decisions and a high expected reward for the DRL-based dynamic treatment regimes.
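
To make step (ii) more concrete, the sketch below shows how a long-term value function might be estimated from logged (observational) transitions using the Q-learning target y = r + γ · max over a' of Q(s', a'). It is not the authors' implementation: a small lookup table stands in for the deep Q-network, and the states, actions, rewards, the discount factor of 0.9, and the function names `q_learning_targets` and `fitted_q_iteration` are all made up for illustration.

```python
import numpy as np


def q_learning_targets(rewards, next_q_values, terminal, gamma=0.9):
    """Bellman targets y = r + gamma * max_a' Q(s', a') for logged transitions."""
    rewards = np.asarray(rewards, dtype=float)
    next_q_values = np.asarray(next_q_values, dtype=float)
    terminal = np.asarray(terminal, dtype=bool)
    best_next = next_q_values.max(axis=1)
    # Terminal transitions have no future value to bootstrap from.
    return rewards + gamma * np.where(terminal, 0.0, best_next)


def fitted_q_iteration(transitions, n_states, n_actions, gamma=0.9, n_sweeps=50):
    """Tabular stand-in (assumption) for a deep Q-network trained on logged data."""
    states, actions, rewards, next_states, terminal = map(
        np.asarray, zip(*transitions)
    )
    q = np.zeros((n_states, n_actions))
    for _ in range(n_sweeps):
        targets = q_learning_targets(rewards, q[next_states], terminal, gamma)
        new_q = q.copy()
        for s in range(n_states):
            for a in range(n_actions):
                mask = (states == s) & (actions == a)
                if mask.any():
                    # Regress Q(s, a) onto the mean target seen in the data.
                    new_q[s, a] = targets[mask].mean()
        q = new_q
    return q


# Hypothetical logged data: (state, action, reward, next_state, terminal).
toy_transitions = [
    (0, 0, 0.0, 1, False),  # action 0 in state 0 moves the patient to state 1
    (0, 1, 1.0, 1, True),   # action 1 gives a small immediate reward, then ends
    (1, 0, 0.0, 1, True),
    (1, 1, 5.0, 1, True),
]

q = fitted_q_iteration(toy_transitions, n_states=2, n_actions=2, gamma=0.9)
greedy_regime = q.argmax(axis=1)  # recommended action for each state
print(q)
print(greedy_regime)
```

In this toy data the greedy regime prefers action 0 in state 0 even though action 1 pays more immediately, because action 0 leads to a state where a larger reward is available. That is the sense in which the value function is "long-term". In a deep Q-network the table would be replaced by a neural network fitted to the same targets.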
