Gutiérrez Maquilón Rodrigo, Uhl Jakob, Schrom-Feiertag Helmut, Tscheligi Manfred
Center for Technology Experience, AIT - Austrian Institute of Technology, Vienna, Austria.
Department of Artificial Intelligence and Human Interfaces, Paris Lodron University of Salzburg, Salzburg, Austria.
JMIR Form Res. 2024 Dec 11;8:e58623. doi: 10.2196/58623.
Training in social-verbal interactions is crucial for medical first responders (MFRs) to assess a patient's condition and perform urgent treatment during emergency medical service administration. Integrating conversational agents (CAs) in virtual patients (VPs), that is, digital simulations, is a cost-effective alternative to resource-intensive human role-playing. There is moderate evidence that CAs improve communication skills more effectively when used with instructional interventions. However, more recent GPT-based artificial intelligence (AI) produces richer, more diverse, and more natural responses than previous CAs and offers control over prosodic voice qualities such as pitch and duration. These functionalities have the potential to better match the interaction expectations of MFRs regarding habitability.
We aimed to study how the integration of GPT-based AI in a mixed reality (MR)-VP could support communication training of MFRs.
We developed an MR simulation of a traffic accident with a VP. ChatGPT (OpenAI) was integrated into the VP and prompted with verified characteristics of accident victims. MFRs (N=24) were instructed on how to interact with the MR scenario. After assessing and treating the VP, the MFRs were administered the Mean Opinion Scale-Expanded, version 2, and the Subjective Assessment of Speech System Interfaces questionnaires to study their perception of the voice quality and the usability of the voice interactions, respectively. Open-ended questions were asked after completing the questionnaires. The observed and logged interactions with the VP, descriptive statistics of the questionnaires, and the output of the open-ended questions are reported.
The usability assessment of the VP resulted in moderate positive ratings, especially in habitability (median 4.25, IQR 4-4.81) and likeability (median 4.50, IQR 3.97-5.91). Interactions were negatively affected by the approximately 3-second latency of the responses. MFRs acknowledged the naturalness of determining the physiological states of the VP through verbal communication, for example, with questions such as "Where does it hurt?" However, the question-answer dynamic in the verbal exchange with the VP and the lack of the VP's ability to start the verbal exchange were noticed. Noteworthy insights highlighted the potential of domain-knowledge prompt engineering to steer the actions of MFRs for effective training.
Generative AI in VPs facilitates MFRs' training but continues to rely on instructions for effective verbal interactions. Therefore, the capabilities of the GPT-VP and a training protocol need to be communicated to trainees. Future implementations should add triggers based on keyword recognition, have the VP point to the area that hurts, apply conversational turn-taking techniques, and give the VP the ability to start a verbal exchange. Furthermore, a local AI server, chunk processing, and lowering the audio resolution of the VP's voice could reduce the response delay and allay privacy concerns. Prompting could be used in future studies to create a virtual MFR capable of assisting trainees.
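As a rough illustration of the keyword-recognition triggers proposed above, the following minimal sketch maps words in a trainee's transcribed utterance to VP actions. The trigger names, keyword lists, and the `match_triggers` function are hypothetical and are not taken from the study; a real system would work on speech recognizer output and would likely need more robust matching.

```python
import re
from typing import List

# Hypothetical trigger table: action name -> keywords that fire it.
# Neither the names nor the keywords come from the study.
TRIGGERS = {
    "point_to_pain": ("hurt", "hurts", "pain"),
    "report_breathing": ("breathe", "breathing"),
}


def match_triggers(utterance: str) -> List[str]:
    """Return the names of all triggers whose keywords occur in the utterance."""
    # Lowercase and split into words so that "hurt?" still matches "hurt".
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    return [name for name, keywords in TRIGGERS.items() if words & set(keywords)]


if __name__ == "__main__":
    # The example question is the one quoted in the results.
    print(match_triggers("Where does it hurt?"))
```

A matched trigger such as the assumed `point_to_pain` could then drive an animation of the VP pointing to the area that hurts, independently of the generated verbal reply.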