School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC, Australia.
Department of Anaesthesia and Perioperative Medicine, Monash Health, Clayton, VIC, Australia.
Can J Anaesth. 2019 Feb;66(2):193-200. doi: 10.1007/s12630-018-1251-7. Epub 2018 Nov 14.
Competency-based anesthesia training programs require robust assessment of trainee performance and commonly combine different types of workplace-based assessment (WBA) covering multiple facets of practice. This study measured the reliability of WBAs in a large existing database and explored how they could be combined to optimize reliability for assessment decisions.
We used generalizability theory to measure the composite reliability of four different types of WBAs used by the Australian and New Zealand College of Anaesthetists: mini-Clinical Evaluation Exercise (mini-CEX), direct observation of procedural skills (DOPS), case-based discussion (CbD), and multi-source feedback (MSF). We then modified the number and weighting of WBA combinations to optimize reliability with fewer assessments.
We analyzed 67,405 assessments from 1,837 trainees and 4,145 assessors. We assumed acceptable reliability for interim (intermediate stakes) and final (high stakes) decisions of 0.7 and 0.8, respectively. When every assessment carried the same weight regardless of type, 12 assessments were needed to reach the 0.7 threshold and 20 to reach 0.8, depending on the combination of WBA types. With optimized weighting, acceptable reliability for interim and final decisions was possible with nine (e.g., two DOPS, three CbD, two mini-CEX, two MSF) and 15 (e.g., two DOPS, eight CbD, three mini-CEX, two MSF) assessments, respectively.
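The relationship between the number of assessments, their weighting, and reliability can be illustrated with a minimal Python sketch of the standard generalizability-theory decision-study formulas. This is not the authors' analysis code. The function names are hypothetical, and the variance components in the usage example are invented for illustration only; they are not estimates from this study.

```python
from math import ceil


def single_type_reliability(var_person, var_error, n):
    """G coefficient for the mean of n assessments of a single type."""
    return var_person / (var_person + var_error / n)


def assessments_needed(var_person, var_error, target):
    """Smallest n whose mean score reaches the target coefficient.

    Rearranged from target = var_person / (var_person + var_error / n).
    """
    n = target * var_error / ((1 - target) * var_person)
    # Subtract a tiny tolerance so floating-point noise does not add one.
    return max(1, ceil(n - 1e-9))


def composite_reliability(weights, cov_person, var_error, counts):
    """Reliability of a weighted composite of several assessment types.

    weights    -- weight given to each type's mean score
    cov_person -- person (universe-score) covariance matrix between types
    var_error  -- error variance of a single assessment of each type
    counts     -- number of assessments of each type
    """
    k = len(weights)
    universe = sum(
        weights[i] * weights[j] * cov_person[i][j]
        for i in range(k)
        for j in range(k)
    )
    error = sum(
        w * w * ve / n for w, ve, n in zip(weights, var_error, counts)
    )
    return universe / (universe + error)


if __name__ == "__main__":
    # Hypothetical variance components, NOT values from the study.
    print(assessments_needed(1.0, 4.0, 0.7))
    print(composite_reliability(
        weights=[0.5, 0.5],
        cov_person=[[1.0, 0.5], [0.5, 1.0]],
        var_error=[4.0, 4.0],
        counts=[4, 4],
    ))
```

Under these formulas, giving more weight or more observations to a WBA type with lower error variance raises composite reliability for the same total number of assessments, which is the general idea behind optimizing the weighting.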
Reliability is an important factor to consider when designing assessments, and measuring composite reliability can allow the selection of a WBA portfolio with adequate reliability to provide evidence for defensible decisions on trainee progression.