Lurie Stephen J, Mooney Christopher J, Nofziger Anne C, Meldrum Sean C, Epstein Ronald M
Office of Educational Evaluation and Research, University of Rochester School of Medicine and Dentistry, Rochester, New York, USA.
Med Educ. 2008 Jul;42(7):662-8. doi: 10.1111/j.1365-2923.2008.03080.x. Epub 2008 May 23.
Subjective rating scales for communication skills may yield more personally meaningful responses than more standardised rating schemes. It is unclear, however, whether such evaluations may be overly biased by respondents' rating styles, which may lead to unreliable measurement of examinees' communication skills.
Our study involved 212 students from the classes of 2005 and 2006 at the University of Rochester School of Medicine and Dentistry. All students were rated by actors depicting standardised patients (SPs) on the same seven cases using the 19-item Rochester Communication Rating Scale (RCRS). Different students were assigned to different actors playing the same SP. We assessed the extent to which actors' personal rating styles influenced the scores they assigned to students. Main outcome measures were: between-actor variability in responses; the degree to which actors' response styles contributed to overall scores; and improvements in reliability achieved by standardising actors' ratings.
There were statistically significant differences between actors in their mean assigned scores. Scores aggregated over 18 separate SP cases have an expected generalisability coefficient of 0.79. If raw RCRS scores are used, a total of 27 replications of the RCRS are required to achieve a Cronbach's alpha of 0.8; standardisation reduces this number to 18.
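The relationship between the number of replications and the reliability of the aggregate score can be illustrated with the Spearman-Brown prophecy formula. The sketch below is an assumption for illustration, not the authors' stated method: it treats each RCRS replication as a parallel measurement, and the function names are hypothetical. The per-replication reliabilities it derives are implied by the reported figures under that assumption; they are not reported in the study.

```python
def spearman_brown(r_single: float, k: float) -> float:
    """Reliability of a score aggregated over k parallel replications,
    given the reliability r_single of one replication."""
    return k * r_single / (1 + (k - 1) * r_single)


def implied_single_reliability(target: float, k: float) -> float:
    """Invert Spearman-Brown: the single-replication reliability that
    yields `target` reliability when k replications are aggregated."""
    return target / (k - (k - 1) * target)


def replications_needed(r_single: float, target: float) -> float:
    """Number of replications (possibly fractional; round up in practice)
    needed to reach `target` reliability."""
    return target * (1 - r_single) / (r_single * (1 - target))


# Figures from the abstract: 27 replications (raw scores) versus
# 18 replications (standardised scores) to reach a reliability of 0.8.
r_raw = implied_single_reliability(0.8, 27)
r_std = implied_single_reliability(0.8, 18)
print(f"implied single-replication reliability, raw:          {r_raw:.3f}")
print(f"implied single-replication reliability, standardised: {r_std:.3f}")
```

Under this assumption, the drop from 27 to 18 replications corresponds to a higher reliability for each individual replication once ratings are standardised.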
Although actors vary in their use of a standardised subjective scale of communication, such differences account for an acceptably small proportion of the total variance if scores are combined across a large number of cases. Reliability can be markedly improved by standardising scores across raters.
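The abstract does not say how scores were standardised across raters. One common approach, shown here purely as an assumed illustration, is to convert each actor's ratings to z-scores using that actor's own mean and standard deviation, which removes differences in leniency and in spread. The function name and the data layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardise_by_rater(ratings):
    """Convert raw scores to z-scores within each rater.

    ratings: list of (rater_id, student_id, score) tuples.
    Returns a list of (rater_id, student_id, z_score) in the same order.
    """
    scores_by_rater = defaultdict(list)
    for rater, _student, score in ratings:
        scores_by_rater[rater].append(score)

    # Mean and population standard deviation of each rater's scores.
    stats = {r: (mean(s), pstdev(s)) for r, s in scores_by_rater.items()}

    standardised = []
    for rater, student, score in ratings:
        m, sd = stats[rater]
        # A rater who gave identical scores carries no information about
        # differences between students; map those scores to 0.
        z = (score - m) / sd if sd > 0 else 0.0
        standardised.append((rater, student, z))
    return standardised


# Two hypothetical actors: B is systematically more lenient than A.
example = [
    ("A", "s1", 2.0), ("A", "s2", 4.0),
    ("B", "s3", 7.0), ("B", "s4", 9.0),
]
result = standardise_by_rater(example)
```

Within-rater standardisation assumes that students are allocated to actors without regard to ability; if one actor happened to see stronger students, this approach would remove a real difference along with the rating-style effect.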