Kreiter C D, Ferguson K, Lee W C, Brennan R L, Densen P
Office of Consultation and Research in Medical Education, University of Iowa, Iowa City 52242-1008, USA.
Acad Med. 1998 Dec;73(12):1294-8. doi: 10.1097/00001888-199812000-00021.
This study investigated the measurement characteristics of standardized clinical evaluation forms (CEFs) used to assign grades for clerkship performance.
In 1996-97, the authors reviewed 5,168 CEFs completed for 175 students in eight clerkships. Limiting their analysis to the three clerkships that produced the most CEFs, the authors conducted a generalizability study to determine the five variance components for each clerkship. A decision study then calculated the generalizability coefficients and standard errors of measurement in each clerkship for varied numbers of raters and CEF items.
The generalizability study found large variance components attributable to rater and rating context. The decision study found that, when three or more raters completed CEFs for a student, the generalizability coefficient and standard error of measurement reached levels acceptable for grading. Increasing the number of items on the CEF had no significant effect on either measure.
The reliability of assigning students clerkship grades based on single CEFs is unacceptably low. However, CEFs can accurately measure students' clerkship performances if completed by three or more raters.
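The decision-study step described above can be illustrated with a minimal sketch. It assumes a design in which raters are nested within students and crossed with items, which is one design that yields five variance components; the abstract does not state the design. The variance values and the `d_study` helper are hypothetical, chosen only to show how the generalizability coefficient and standard error of measurement change with the number of raters and items. They are not the study's estimates.

```python
import math

def d_study(var, n_raters, n_items):
    """Project reliability for a (rater:person) x item design.

    `var` holds the five variance components (assumed design, not
    confirmed by the abstract): person, item, rater-within-person,
    person-by-item, and the residual.
    """
    # Relative error: every component that changes students' rank order.
    rel_error = (
        var["r:p"] / n_raters
        + var["pi"] / n_items
        + var["ri:p"] / (n_raters * n_items)
    )
    # Absolute error additionally includes the item main effect.
    abs_error = rel_error + var["i"] / n_items

    g_coef = var["p"] / (var["p"] + rel_error)  # generalizability coefficient
    return g_coef, math.sqrt(rel_error), math.sqrt(abs_error)

# Hypothetical variance components, for illustration only.
components = {"p": 0.20, "i": 0.02, "r:p": 0.30, "pi": 0.01, "ri:p": 0.25}

for n_raters in (1, 2, 3, 4, 6):
    g, sem_rel, sem_abs = d_study(components, n_raters, n_items=10)
    print(f"raters={n_raters}  G={g:.2f}  SEM(rel)={sem_rel:.2f}  SEM(abs)={sem_abs:.2f}")
```

With values like these, where the rater component is large compared with the item components, adding raters shrinks the error variance much faster than adding items, which matches the pattern the abstract reports.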