School of Medicine, University of Notre Dame, Sydney, New South Wales, Australia.
Adelaide Medical School, University of Adelaide, Adelaide, South Australia, Australia.
Med Educ. 2018 Mar;52(3):336-346. doi: 10.1111/medu.13495. Epub 2018 Jan 9.
A script concordance test (SCT) is a modality for assessing clinical reasoning. Concerns have been raised about a plausible threat to the validity of SCT scores if students deliberately avoid the extreme answer options to obtain higher scores. The aims of the study were, firstly, to investigate whether students' avoidance of the extreme answer options could result in higher scores and, secondly, to determine whether a 'balanced approach', in which SCT items are carefully constructed to include extreme as well as median options as model responses, would improve the validity of an SCT.
Using the paired-sample t-test, the actual average student scores for 10 SCT papers from 2012-2016 were compared with simulated scores. The latter were generated by recoding all '-2' responses to '-1' and all '+2' responses to '+1', both for the whole cohort and for the bottom 10% of the cohort (simulation 1), and by scoring as if all students had chosen '0' for every response (simulation 2). The actual average and simulated average scores in 2012 (before the 'balanced approach') were compared with those from 2013-2016, when papers had a good balance of modal responses from the expert reference panel.
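The two recoding simulations and the paired comparison described above can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not state the scoring rule, so the sketch assumes the aggregate scoring commonly used for SCTs (credit for an option equals the number of panel members who chose it divided by the number who chose the modal option). All function names and the panel and student data are hypothetical.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev

OPTIONS = (-2, -1, 0, 1, 2)  # five-point SCT response scale

def item_credits(panel_responses):
    """Assumed aggregate scoring: votes for an option / votes for the modal option."""
    counts = Counter(panel_responses)
    modal_votes = max(counts.values())
    return {opt: counts.get(opt, 0) / modal_votes for opt in OPTIONS}

def score(responses, credits_per_item):
    """Percentage score for one student across all items."""
    total = sum(credits[r] for r, credits in zip(responses, credits_per_item))
    return 100.0 * total / len(responses)

def simulate_1(responses):
    """Simulation 1: recode -2 to -1 and +2 to +1; other responses unchanged."""
    return [max(-1, min(1, r)) for r in responses]

def simulate_2(responses):
    """Simulation 2: score as if every response were 0."""
    return [0] * len(responses)

def paired_t(actual, simulated):
    """Paired-sample t statistic and degrees of freedom (n - 1)."""
    diffs = [a - s for a, s in zip(actual, simulated)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical three-item paper with a five-member expert panel per item.
panel = [[2, 2, 2, 1, 1], [0, 0, 0, -1, 1], [-2, -2, -1, -1, -1]]
credits = [item_credits(p) for p in panel]

student = [2, 0, -2]  # hypothetical student responses
actual = score(student, credits)
sim1 = score(simulate_1(student), credits)
sim2 = score(simulate_2(student), credits)
```

In this toy paper, the first item has an extreme modal response, so a student who avoids extremes loses credit on it, while the third item has a non-extreme modal response, so the same student gains credit there. This is the trade-off that a paper balanced between extreme and median modal responses is meant to exploit.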
In 2012, a score increase was seen in simulation 1 in the third-year cohort, from 50.2% to 55.6% (t(10) = 4.818; p = 0.001). Since 2013, with the 'balanced approach', the actual SCT scores (57.4%) were significantly higher than scores in both simulation 1 and simulation 2 (46.7% and 23.9%, respectively).
When constructing SCT examinations, in addition to rigorous pre-examination optimisation, it is desirable to achieve a balance between items that attract extreme responses and those that attract median responses. This could mitigate the validity threat to SCT scores, especially for low-performing students, who have previously been shown to select only median responses and avoid the extreme responses.