Reid Katharine J, Chiavaroli Neville G, Bilszta Justin L C
Department of Medical Education, The University of Melbourne, Melbourne, Australia.
The Australian Council for Educational Research, Melbourne, Australia.
J Med Educ Curric Dev. 2022 Feb 24;9:23821205221081813. doi: 10.1177/23821205221081813. eCollection 2022 Jan-Dec.
Rubrics are utilized extensively in tertiary contexts to assess student performance on written tasks; however, their use for assessment of research projects has received little attention. In particular, there is little evidence on the reliability of examiner judgements according to rubric type (general or specific) in a research context. This research examines the concordance between pairs of examiners assessing a medical student research project during a two-year period employing a generic rubric, followed by a two-year implementation of task-specific rubrics. Following examiner feedback, and in light of the available literature, we expected the task-specific rubrics would increase the consistency of examiner judgements and reduce the need for arbitration due to discrepant marks. In contrast, results showed that generic rubrics provided greater consistency of examiner judgements and fewer arbitrations compared with the task-specific rubrics. These findings have practical implications for educational practice in the assessment of research projects and contribute valuable empirical evidence to inform the development and use of rubrics in medical education.