McKinley Danette W, Boulet John R
Research and Evaluation, Educational Commission for Foreign Medical Graduates, 3624 Market Street, 4th Floor, Philadelphia, PA 19104, USA.
Adv Health Sci Educ Theory Pract. 2004;9(1):29-38. doi: 10.1023/B:AHSE.0000012214.40340.03.
Although studies have examined the effects of a variety of factors on the comparability of scores obtained from standardized patient examinations (SPE), little research has specifically investigated the challenge of detecting drift in case difficulty estimates over time, particularly for large-scale, performance-based assessments. The purpose of the current study was to investigate the use of a procedure to detect drift in the difficulty estimates for a large-scale, high-stakes SPE. The results suggest that, for particular performance tasks, there was some variation in mean scores over time. These findings indicate that, although it is feasible to create a bank of case-SP means and link scores back to these fixed estimates, special attention must be paid to the standardization of exam materials over time. This is essential to ensure comparability of scores and pass-fail decisions for candidates who are assessed on multiple test forms throughout the year.