Burr Steven A, Whittle John, Fairclough Lucy C, Coombes Lee, Todd Ian
Collaboration for the Advancement of Medical Education Research and Assessment (CAMERA), Peninsula Schools of Medicine and Dentistry, Plymouth University, Devon, PL4 8AA, UK.
School of Medicine, University of Nottingham, Queen's Medical Centre, Nottingham, NG7 2UH, UK.
BMC Med Educ. 2016 Jan 28;16:34. doi: 10.1186/s12909-016-0555-y.
Fixed mark grade boundaries for non-linear assessment scales fail to account for variations in assessment difficulty. Where assessment difficulty varies more than ability of successive cohorts or the quality of the teaching, anchoring grade boundaries to median cohort performance should provide an effective method for setting standards.
This study investigated the use of a modified Hofstee (MH) method for setting unsatisfactory/satisfactory and satisfactory/excellent grade boundaries for multiple choice question-style assessments, adjusted using the cohort median to obviate the effect of subjective judgements and provision of grade quotas.
Outcomes for the MH method were compared with formula scoring/correction for guessing (FS/CFG) for 11 assessments. There were no significant differences between MH and FS/CFG in either the effective unsatisfactory/satisfactory grade boundary or the proportion of candidates graded unsatisfactory (p > 0.05). However, the boundary for excellent performance was significantly higher for MH (p < 0.01), and the proportion of candidates returned as excellent was significantly lower (p < 0.01). MH also generated performance profiles and pass marks that were not significantly different from those given by the Ebel method of criterion-referenced standard setting.
This supports MH as an objective model for calculating variable grade boundaries, adjusted for test difficulty. Furthermore, it easily creates boundaries for unsatisfactory/satisfactory and satisfactory/excellent performance that are protected against grade inflation. It could be implemented as a stand-alone method of standard setting, or as part of the post-examination analysis of results for assessments for which pre-examination criterion-referenced standard setting is employed.
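For readers unfamiliar with the comparator, the following is a minimal sketch of the classic correction-for-guessing formula (corrected score = right − wrong / (k − 1), where k is the number of answer options per item). It is not taken from the study: the function name, the five-option item format and the candidate's counts are illustrative assumptions, and the abstract does not state how FS/CFG was mapped onto grade boundaries.

```python
def formula_score(right: int, wrong: int, options_per_item: int) -> float:
    """Classic correction for guessing: R - W / (k - 1).

    Omitted items are neither rewarded nor penalised.
    The name and signature are hypothetical, not from the paper.
    """
    if options_per_item < 2:
        raise ValueError("an item needs at least two answer options")
    return right - wrong / (options_per_item - 1)


# Hypothetical candidate: 100 five-option items, 70 right, 20 wrong, 10 omitted.
corrected = formula_score(right=70, wrong=20, options_per_item=5)
print(corrected)  # 65.0
```

Under this formula a candidate who guesses blindly on every item has an expected corrected score of zero, which is why it is often used as a baseline for test difficulty.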