From Jump Simulation (W.F.B., T.J.L., M.J.M., J.L.F., D.M.K., K.M.M., R.A.E.-A., M.A.); OSF Healthcare (W.F.B., T.J.L., M.J.M., J.L.F., J.S.M., J.T.T., D.M.K., K.M.M., R.A.E.-A., D.N.M., M.A.); University of Illinois College of Medicine at Peoria (W.F.B., T.J.L., M.J.M., J.T.T., D.N.M., M.A.), Peoria, IL; The Institute for Creative Technologies (T.B.T.), Keck School of Medicine, University of Southern California, Los Angeles, CA; and Uniformed Services University (T.B.T.), Bethesda, MD.
Simul Healthc. 2019 Aug;14(4):241-250. doi: 10.1097/SIH.0000000000000373.
High-value care (HVC) suggests that good history taking and physical examination should lead to risk stratification that drives the use or withholding of diagnostic testing. This study describes the development of a series of virtual standardized patient (VSP) cases and provides preliminary evidence that supports their ability to provide experiential learning in HVC.
This pilot study used VSPs, or natural language processing-based patient avatars, within the USC Standard Patient platform. Faculty consensus was used to develop the cases, including the optimal diagnostic testing strategies, treatment options, and scored content areas. First-year resident physician learners experienced two 90-minute didactic sessions before completing the cases in a computer laboratory, using typed text to interview the avatar for history taking, then completing physical examination, differential diagnosis, diagnostic testing, and treatment modules for each case. Learners chose a primary and 2 alternative "possible" diagnoses from a list of 6 to 7 choices, diagnostic testing options from an extensive list, and treatments from a brief list ranging from 6 to 9 choices. For the history-taking module, both faculty and the platform scored the learners, and faculty assessed the appropriateness of avatar responses. Four randomly selected learner-avatar interview transcripts for each case were double rated by faculty for interrater reliability calculations. Intraclass correlations were calculated for interrater reliability, and Spearman ρ was used to determine the correlation between the platform and faculty ranking of learners' history-taking scores.
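As an illustration of the rank-correlation step described above, the following is a minimal sketch of Spearman ρ computed as the Pearson correlation of average ranks. It is not the study's analysis code: the function names `average_ranks` and `spearman_rho` and the score lists are hypothetical and chosen only to show the calculation.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation applied to the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-learner history-taking scores (not data from the study).
faculty_scores = [0.62, 0.48, 0.71, 0.55, 0.40, 0.66]
platform_scores = [0.58, 0.41, 0.60, 0.52, 0.43, 0.63]

rho = spearman_rho(faculty_scores, platform_scores)
print(f"Spearman rho = {rho:.2f}")
```

Because only the ordering of learners enters the calculation, ρ can be high even when the two raters differ in absolute scores, as with the faculty and platform means reported in this study.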
Eight VSP cases were experienced by 14 learners. Investigators reviewed 112 transcripts (4646 learner query-avatar responses). Interrater reliability means were 0.87 for learner query scoring and 0.83 for avatar response. Mean learner success for history taking was scored by the faculty at 57% and by the platform at 51% (ρ correlation of learner rankings = 0.80, P = 0.02). The mean avatar appropriate response rate was 85.6% for all cases. Learners chose the correct diagnosis within their 3 choices 82% of the time, ordered a median (interquartile range) of 2 (2) unnecessary tests and completed 56% of optimal treatments.
Our avatar appropriate response rate was similar to past work using similar platforms. The simulations give detailed insights into the thoroughness of learner history taking and testing choices and with further refinement should support learning in HVC.