Emergency Medicine, Hennepin County Medical Center, University of Minnesota Medical School, Minneapolis, MN.
Department of Emergency Medicine, Lehigh Valley Health Network, Allentown, PA.
Acad Emerg Med. 2018 Feb;25(2):205-220. doi: 10.1111/acem.13296. Epub 2017 Nov 9.
All residency programs in the United States are required to report their residents' progress on the milestones to the Accreditation Council for Graduate Medical Education (ACGME) biannually. Since this competency-based assessment framework was developed and instituted, residency programs have been trying to determine the best ways to assess resident performance on these metrics. The ACGME recommended simulation as one method of assessment for many of the milestone subcompetencies. We developed three simulation scenarios with scenario-specific, milestone-based assessment tools and aimed to gather validity evidence for these tools.
We conducted a prospective observational study to gather validity evidence for three mannequin-based simulation scenarios for assessing individual residents on emergency medicine (EM) milestones. The subcompetencies included (e.g., patient care [PC]1, PC2, PC3) were identified via a modified Delphi technique using a group of experienced EM simulationists. The scenario-specific checklist (CL) items were designed based on the individual milestone items within each EM subcompetency chosen for assessment and were reviewed by experienced EM simulationists. Two independent live raters, who were EM faculty at the respective study sites, scored each scenario following brief rater training. The inter-rater reliability (IRR) of the assessment tool was determined by measuring the intraclass correlation coefficient (ICC) for the sum of the CL items as well as for the global rating scales (GRSs) for each scenario. GRS and CL scores were compared across postgraduate year (PGY) levels using analysis of variance.
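The two statistics named above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' analysis: the abstract does not say which ICC form was used, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater) as defined by Shrout and Fleiss. The function names `icc_2_1` and `one_way_anova_f` and the demo numbers are hypothetical.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1) for an (n_subjects x k_raters) array of scores.

    Assumption: two-way random effects, absolute agreement, single rater.
    The abstract does not state which ICC form the study used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; each group holds one PGY level's scores."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    pooled = np.concatenate(groups)
    grand = pooled.mean()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(np.sum((g - g.mean()) ** 2) for g in groups)
    df_between = len(groups) - 1
    df_within = len(pooled) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


if __name__ == "__main__":
    # Made-up checklist sums for illustration only (not study data):
    # rows are residents, columns are the two live raters.
    demo_scores = [[10, 11], [14, 14], [8, 9], [12, 13]]
    print("ICC(2,1):", icc_2_1(demo_scores))
    # Made-up scores grouped by PGY level.
    print("F:", one_way_anova_f([[8, 9, 10], [10, 12, 11], [13, 14, 12]]))
```

Turning the F statistic into a p value requires the F distribution with `df_between` and `df_within` degrees of freedom, which a statistics package would normally supply.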
Eight subcompetencies were chosen for assessment across three simulation cases, with 118 subjects. Evidence of test content, internal structure, response process, and relations with other variables was found. The ICCs for the sum of the CL items and the GRSs were >0.8 for all cases, with one exception (clinical management GRS = 0.74 in the sepsis case). The sum of CL items and the GRSs discriminated between PGY levels on all cases (p < 0.05). However, when the specific CL items were mapped back to milestones at the various proficiency levels, the milestones at the higher proficiency levels (level 3 [L3] and level 4 [L4]) did not often discriminate between PGY levels. L3 milestone items discriminated between PGY levels on five of the 12 occasions they were assessed, and L4 items on only two of the 12 occasions they were assessed.
Three simulation cases with scenario-specific assessment tools allowed evaluation of EM residents on proficiency L1 to L4 within eight of the EM milestone subcompetencies. Evidence of test content, internal structure, response process, and relations with other variables was found. Good to excellent IRR and the ability to discriminate between PGY levels were found for both the sum of CL items and the GRSs. However, there was no positive relationship between advancing PGY level and the completion of higher-level milestone items (L3 and L4).