Weisman Dan, Sugarman Alanna, Huang Yue Ming, Gelberg Lillian, Ganz Patricia A, Comulada Warren Scott
UCLA Simulation Center, University of California, Los Angeles, Los Angeles, CA, United States.
David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States.
JMIR Form Res. 2025 Apr 17;9:e65670. doi: 10.2196/65670.
Standardized patients (SPs) prepare medical students for difficult conversations with patients. Despite the value of SPs, SP-based simulation training is constrained by available resources and competing clinical demands. Researchers are turning to artificial intelligence and large language models, such as generative pretrained transformers, to create communication training that incorporates virtual simulated patients (VSPs). GPT-4, an advanced large language model, allows developers to design virtual simulation scenarios using text-based prompts instead of relying on branching path simulations with prescripted dialogue. These nascent development practices are not yet well documented in the literature, leaving other researchers with little guidance for developing their own simulations.
This study aims to describe our development process and lessons learned in creating a GPT-4-driven VSP. We designed the VSP to help medical student learners, acting as a primary care physician (PCP), rehearse discussing abnormal mammography results with a patient. We aimed to assess GPT-4's ability to generate appropriate VSP responses to learners during spoken conversations and provide appropriate feedback on learner performance.
A research team composed of physicians, a medical student, an educator, an SP program director, a learning experience designer, and a health care researcher conducted the study. A formative phase with in-depth knowledge user interviews informed development, followed by a development phase to create the virtual training module. The team conducted interviews with 5 medical students, 5 PCPs, and 5 breast cancer survivors. They then developed a VSP using simulation authoring software and provided the GPT-4-enabled VSP with an initial prompt consisting of a scenario description, emotional state, and expectations for learner dialogue. The prompt was iteratively refined through an agile design process involving repeated cycles of testing, documenting issues, and revising the prompt. As an exploratory feature, the simulation used GPT-4 to provide written feedback to learners about their performance communicating with the VSP and their adherence to guidelines for difficult conversations.
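The three-part prompt and its test-and-revise cycle described above can be illustrated with a minimal sketch. This is not the authors' prompt or their simulation authoring software: the class name `VspPrompt`, its fields, the `revise` helper, and all prompt wording other than the scenario summary are hypothetical placeholders, and no call to GPT-4 is made.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VspPrompt:
    """Hypothetical container for the three prompt parts named in the abstract."""

    scenario: str
    emotional_state: str
    learner_expectations: str
    revision: int = 1

    def render(self) -> str:
        """Join the three parts into one text prompt for a language model."""
        return "\n\n".join(
            [
                f"Scenario: {self.scenario}",
                f"Emotional state: {self.emotional_state}",
                f"Expectations for learner dialogue: {self.learner_expectations}",
            ]
        )

    def revise(self, **changes: str) -> "VspPrompt":
        """Return a copy with the given fields changed and the revision number incremented."""
        return replace(self, revision=self.revision + 1, **changes)


# Placeholder wording for illustration only; the study's actual prompt text is not given.
initial = VspPrompt(
    scenario=(
        "Telephone call in which a primary care physician shares abnormal "
        "diagnostic mammogram results that indicate a need for a biopsy."
    ),
    emotional_state="Worried about what the results mean.",
    learner_expectations="The learner explains the results and the next steps.",
)

# One test-document-revise cycle: a tester notes an issue, and one field is rewritten.
revised = initial.revise(
    emotional_state="Worried, and asks follow-up questions when answers are vague."
)

print(revised.revision)
print(revised.render())
```

Keeping each revision as an immutable value is one way to make every documented issue traceable to the exact prompt text that produced it; the study may have tracked revisions differently.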
In-depth interviews helped establish the appropriate timing, mode of communication, and protocol for conversations between PCPs and patients during the breast cancer screening process. The scenario simulated a telephone call between a physician and patient to discuss the abnormal results of a diagnostic mammogram that indicated a need for a biopsy. Preliminary testing was promising. The VSP asked sensible questions about their mammography results and responded to learner inquiries using a voice replete with appropriate emotional inflections. GPT-4 generated performance feedback that successfully identified strengths and areas for improvement using relevant quotes from the learner-VSP conversation, but it occasionally misidentified learner adherence to communication protocols.
GPT-4 streamlined development and facilitated more dynamic, humanlike interactions between learners and the VSP compared to branching path simulations. As a next step, we will pilot-test the VSP with medical students to evaluate its feasibility and acceptability.