Institute of Visual Informatics, National University of Malaysia, Bangi, Selangor, Malaysia.
Artif Intell Med. 2024 Aug;154:102901. doi: 10.1016/j.artmed.2024.102901. Epub 2024 Jun 4.
There is evidence that reducing modifiable risk factors and strengthening medical and health interventions can reduce early mortality and economic losses from non-communicable diseases (NCDs). Machine learning (ML) algorithms have been successfully applied to preventing and controlling NCDs. Reinforcement learning (RL) is the most promising of these approaches because it can dynamically adapt interventions to the progression of an NCD and is designed to optimize long-term intervention goals. To facilitate the early application of RL algorithms in clinical practice research for NCD interventions, this paper reviews the preferred algorithms, data sources, design details, and obstacles to clinical application reported in existing studies. We screened 40 relevant papers for quantitative and qualitative analysis using the PRISMA review flow diagram. The results show that researchers tend to use Deep Q-Network (DQN) and Actor-Critic algorithms, as well as their improved or hybrid variants, to train and validate RL models on retrospective datasets. In most studies, the patient's physical condition is the main parameter defining the state space, interventions are the main parameter defining the action space, and changes in the patient's physical condition serve as the basis for the agent's immediate reward. Existing research has proposed several approaches to the challenges of clinical application. However, no universally accepted solution exists, so using RL algorithms in clinical practice for NCD interventions requires more comprehensive responses to the issues discussed in this paper: safety, interpretability, training efficiency, and the exploration-exploitation trade-off in RL algorithms.
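The state, action, and reward design summarized in the abstract can be illustrated with a minimal sketch. This is not taken from any of the reviewed studies: it uses plain tabular Q-learning rather than DQN or Actor-Critic, and the severity levels, intervention intensities, costs, and transition probabilities are all hypothetical values chosen only for illustration. The state is a discretized patient condition, the action is an intervention intensity, and the immediate reward is the change in condition minus an assumed intervention cost.

```python
import random

# All values below are hypothetical and chosen only for illustration.
N_STATES = 5                  # severity levels: 0 (best) .. 4 (worst)
ACTIONS = (0, 1, 2)           # 0 = no intervention, 1 = moderate, 2 = intensive
COST = (0.0, 0.2, 0.5)        # assumed burden of each intervention
P_IMPROVE = (0.1, 0.5, 0.8)   # assumed chance that the condition improves


def step(state, action, rng):
    """Toy transition: stronger interventions make improvement more likely."""
    if rng.random() < P_IMPROVE[action]:
        next_state = max(state - 1, 0)
    else:
        next_state = min(state + 1, N_STATES - 1)
    # Immediate reward: change in condition, minus the intervention cost.
    reward = (state - next_state) - COST[action]
    return next_state, reward


def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                       # explore
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])   # exploit
            next_state, reward = step(state, action, rng)
            # Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued intervention for each severity level."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The `epsilon` parameter controls the exploration-exploitation trade-off named in the abstract. In a clinical setting, exploring by trying random interventions on real patients is generally not acceptable, which is one reason the reviewed studies train on retrospective datasets instead of a live environment like this toy one.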