Kakdas Yasar C, Kockara Sinan, Halic Tansel, Demirel Doga
Florida Polytechnic Univ., Dept. of Computer Science, Lakeland, FL, USA 33805.
Rice Univ., Dept. of Computer Science, Houston, TX, USA 77005.
IEEE Trans Learn Technol. 2024;17:1248-1260. doi: 10.1109/tlt.2024.3372508. Epub 2024 Mar 4.
This study presents a 3D medical simulation that uses reinforcement learning (RL) and interactive reinforcement learning (IRL) to teach and assess the donning and doffing of personal protective equipment (PPE). The simulation is motivated by the need for effective, safe, and remote training techniques in medicine, particularly in light of the COVID-19 pandemic. It has two modes: a tutorial mode and an assessment mode. In the tutorial mode, an undertrained RL agent learns the correct sequence for donning the PPE by trial and error. Its mistakes expose students to many outlier cases they might not encounter in an in-class educational model. In the assessment mode, an IRL-based method evaluates how effectively the participant corrects the mistakes made by the RL agent. Each time the RL agent interacts with the environment and performs an action, the participant gives positive or negative feedback on that action. After the assessment, participants receive a score based on the accuracy of their feedback and the time the RL agent took to learn the correct sequence. An experiment was conducted with two groups of 10 participants each. The first group received RL-assisted training for donning PPE, followed by an IRL-based assessment. The second group watched a video in which the RL agent demonstrated only the correct donning order, without outlier cases, replicating traditional training, and then took the same assessment as the first group. Results showed that RL-assisted training with many outlier cases was more effective than traditional training with only regular cases. Moreover, combining RL with IRL significantly enhanced the participants' performance. Notably, 90% of the participants who received RL-assisted training finished the assessment with perfect scores within three iterations. In contrast, only 10% of those who did not engage in RL-assisted training finished the assessment with a perfect score, highlighting the substantial impact of RL and IRL integration on participants' overall achievement.
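The feedback loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the donning order, the simulated participant, the learning rate, and the exploration rate are all assumptions made for illustration, and the update is a simplified bandit-style tabular rule rather than a full RL algorithm. The sketch reports the two quantities the abstract says the score depends on (feedback accuracy and time for the agent to learn) without guessing how the study combines them.

```python
import random

# Illustrative donning order; an assumption for this sketch, not taken from the study.
PPE_ORDER = ["gown", "mask", "goggles", "gloves"]


def simulated_feedback(step, item, accuracy, rng):
    """Hypothetical stand-in for a participant: returns (feedback, was_correct).

    The true label is +1 if `item` belongs at `step`, else -1. With
    probability 1 - accuracy the simulated participant gives the wrong label.
    """
    truth = 1 if PPE_ORDER[step] == item else -1
    was_correct = rng.random() < accuracy
    return (truth if was_correct else -truth), was_correct


def has_learned(q):
    """True once the greedy choice at every step is the right item with positive value."""
    for step, target in enumerate(PPE_ORDER):
        best = max(q[step], key=q[step].get)
        if best != target or q[step][best] <= 0:
            return False
    return True


def run_session(accuracy=1.0, max_episodes=200, alpha=0.5, epsilon=0.2, seed=0):
    """Train the agent from feedback and report learning time and feedback accuracy."""
    rng = random.Random(seed)
    # One value table per donning step: q[step][item] estimates the feedback for that choice.
    q = [{item: 0.0 for item in PPE_ORDER} for _ in PPE_ORDER]
    correct = total = 0
    for episode in range(1, max_episodes + 1):
        for step in range(len(PPE_ORDER)):
            # Epsilon-greedy choice. For simplicity the agent may pick an item
            # twice; a fuller model would track what is already worn.
            if rng.random() < epsilon:
                item = rng.choice(PPE_ORDER)
            else:
                item = max(q[step], key=q[step].get)
            feedback, was_correct = simulated_feedback(step, item, accuracy, rng)
            correct += was_correct
            total += 1
            # Move the estimate toward the feedback signal (+1 or -1).
            q[step][item] += alpha * (feedback - q[step][item])
        if has_learned(q):
            return {"episodes_to_learn": episode, "feedback_accuracy": correct / total}
    return {"episodes_to_learn": None, "feedback_accuracy": correct / total}
```

Calling `run_session(accuracy=1.0)` models a participant whose feedback is always right. Lowering `accuracy` models mistaken feedback, which in this sketch tends to delay learning or prevent it within `max_episodes`.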