Zhao Haiyan, Andersson Björn, Guo Boliang, Xin Tao
Faculty of Psychology, Beijing Normal University, Beijing, China.
Beijing Education Examinations Authority, Beijing, China.
Front Psychol. 2017 Jun 7;8:933. doi: 10.3389/fpsyg.2017.00933. eCollection 2017.
Writing assessments are an indispensable part of most language competency tests. In our research, we used cross-classified models to study rater effects in the real essay rating process of a large-scale, high-stakes educational examination administered in China in 2011. Generally, four cross-classified models are suggested for investigating rater effects, addressing: (1) the existence of sequential effects, (2) the direction of those sequential effects, and (3) differences among raters related to their individual characteristics. We applied these models to the data to account for possible cluster effects arising from the use of multiple rating strategies. The results showed that raters exhibited sequential effects during the rating process. In contrast to many other studies on rater effects, our study found that raters exhibited assimilation effects. More experienced, more lenient, and better qualified raters were less susceptible to assimilation effects. In addition, our research demonstrated the feasibility and appropriateness of using cross-classified models for assessing rater effects in data with this structure. This paper also discusses the implications for educators and practitioners who are interested in reducing sequential effects in the rating process, and suggests directions for future research.
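For readers unfamiliar with the approach, a cross-classified model for this kind of design can be sketched as below; the parameterization (the variable names and the single lagged previous-score term) is illustrative only and is not necessarily the exact specification estimated in the study.

\[
Y_{i(jk)} = \beta_0 + \beta_1\,\mathrm{PrevScore}_{i(jk)} + u_j + v_k + e_{i(jk)},
\qquad
u_j \sim N(0,\sigma_u^2),\;
v_k \sim N(0,\sigma_v^2),\;
e_{i(jk)} \sim N(0,\sigma_e^2),
\]

where \(Y_{i(jk)}\) is the score given on rating occasion \(i\) to essay \(j\) by rater \(k\), \(u_j\) and \(v_k\) are crossed random effects for essays and raters, and \(\mathrm{PrevScore}_{i(jk)}\) is the score the same rater assigned to the immediately preceding essay. In such a specification, a positive \(\beta_1\) would indicate an assimilation effect (ratings pulled toward the preceding score), while a negative \(\beta_1\) would indicate a contrast effect.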