Department of Otolaryngology, University of California San Francisco, San Francisco, CA, USA.
Department of Surgery, University of British Columbia, Vancouver, BC, Canada.
Adv Health Sci Educ Theory Pract. 2023 Aug;28(3):793-809. doi: 10.1007/s10459-022-10189-5. Epub 2022 Nov 28.
Clinical supervisors are known to assess trainee performance idiosyncratically, causing concern about the validity of their ratings. The literature on this issue relies heavily on retrospective collection of decisions, which risks yielding inaccurate information about what actually drives raters' perceptions. Capturing in-the-moment information about supervisors' impressions could yield better insight into how to intervene. The purpose of this study, therefore, was to gather "real-time" judgments to explore what drives preceptors' judgments of student performance. We performed a prospective study in which physicians were asked to adjust a rating scale in real time while watching two video-recordings of trainee clinical performances. Scores were captured in 1-s increments, examined for frequency, direction, and magnitude of adjustments, and compared to assessors' final entrustability judgment as measured by the modified Ottawa Clinic Assessment Tool. The standard deviation in raters' judgments was examined as a function of time to determine how long it takes impressions to begin to vary. Twenty participants viewed two clinical vignettes. Considerable variability in ratings was observed, with different behaviours triggering scale adjustments for different raters. That idiosyncrasy emerged very quickly, with the standard deviation in raters' judgments rapidly increasing within 30 s of case onset. Particular moments appeared to be generally influential, but their degree of influence still varied. Correlations between the final assessment and (a) the score assigned upon first adjustment of the scale, (b) the score upon last adjustment, and (c) the mean score were r = 0.13, 0.32, and 0.57, respectively, for one video and r = 0.30, 0.50, and 0.52 for the other, indicating the degree to which overall impressions reflected accumulation of raters' idiosyncratic moment-by-moment observations.
Our results demonstrated that variability in raters' impressions begins very early in a case presentation and is associated with different behaviours having different influence on different raters. More generally, this study outlines a novel methodology that offers a new path for gaining insight into factors influencing assessor judgments.
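The analyses described above (between-rater standard deviation over time, and correlations of the final assessment with the first-adjustment, last-adjustment, and mean scores) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the starting scale position, and the toy traces are all assumptions made for illustration, and the numbers are invented rather than taken from the study.

```python
from statistics import mean, stdev

def sd_over_time(traces):
    """Between-rater standard deviation at each 1-s sample.

    `traces` holds one equal-length list of per-second scores per rater.
    """
    return [stdev(scores_at_t) for scores_at_t in zip(*traces)]

def summarize_trace(trace, start):
    """Return (score at first adjustment, score at last adjustment, mean score).

    `start` is the assumed scale position before the video begins.
    """
    previous = [start] + trace[:-1]
    # Keep the score at every second where the rater moved the scale.
    adjusted = [s for s, p in zip(trace, previous) if s != p]
    first = adjusted[0] if adjusted else start
    last = adjusted[-1] if adjusted else start
    return first, last, mean(trace)

def pearson_r(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Toy data: three hypothetical raters, five seconds each (invented values).
traces = [
    [3, 3, 4, 4, 5],
    [3, 2, 2, 2, 1],
    [3, 3, 3, 4, 4],
]
final_ratings = [5, 1, 4]  # hypothetical final entrustability scores

spread = sd_over_time(traces)
firsts, lasts, means = zip(*(summarize_trace(t, start=3) for t in traces))
r_first = pearson_r(firsts, final_ratings)
r_last = pearson_r(lasts, final_ratings)
r_mean = pearson_r(means, final_ratings)
```

In this sketch, `spread` corresponds to the standard deviation plotted as a function of time, and `r_first`, `r_last`, and `r_mean` correspond to correlations (a), (b), and (c) for a single video.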