M.S. Ryan is associate professor and assistant dean, Clinical Medical Education, Department of Pediatrics, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0003-3266-9289 .
A.R. Khan is associate professor, director, Doctoring and Clinical Skills course, and clerkship director, Department of Internal Medicine, University of Illinois College of Medicine, Chicago, Illinois; ORCID: https://orcid.org/0000-0002-2306-4643 .
Acad Med. 2022 Apr 1;97(4):544-551. doi: 10.1097/ACM.0000000000004222.
In undergraduate medical education (UME), competency-based medical education has been operationalized through the 13 Core Entrustable Professional Activities for Entering Residency (Core EPAs). Direct observation in the workplace using rigorous, valid, and reliable measures is required to inform summative decisions about graduates' readiness for residency. The purpose of this study was to investigate the validity evidence of 2 proposed workplace-based entrustment scales.
The authors of this multisite, randomized, experimental study used structured vignettes and experienced raters to examine validity evidence of the Ottawa scale and the UME supervisory tool (Chen scale) in 2019. The authors used a series of 8 cases (6 developed de novo) depicting learners at preentrustable (less-developed) and entrustable (more-developed) skill levels across 5 Core EPAs. Participants from Core EPA pilot institutions rated learner performance using either the Ottawa or Chen scale. The authors used descriptive statistics and analysis of variance to examine data trends and compare ratings, conducted interrater reliability and generalizability studies to evaluate consistency among participants, and performed a content analysis of narrative comments.
Fifty clinician-educators from 10 institutions participated, yielding 579 discrete EPA assessments. Both the Ottawa and Chen scales differentiated between less- and more-developed skill levels (P < .001). The intraclass correlation was good to excellent for all EPAs using the Ottawa scale (range, 0.68-0.91) and fair to excellent using the Chen scale (range, 0.54-0.83). Generalizability analysis revealed substantial variance in ratings attributable to the learner-EPA interaction (59.6% for Ottawa; 48.9% for Chen), suggesting that variability in ratings was appropriately associated with performance on individual EPAs.
In a structured setting, both the Ottawa and Chen scales distinguished between preentrustable and entrustable learners; however, the Ottawa scale demonstrated more desirable characteristics. These findings represent a critical step forward in developing valid, reliable instruments to measure learner progression toward entrustment for the Core EPAs.