Lamprianou Iasonas
University of Cyprus, Nicosia, Cyprus.
Educ Psychol Meas. 2018 Jun;78(3):430-459. doi: 10.1177/0013164416689696. Epub 2017 Feb 5.
It is common practice for assessment programs to organize qualifying sessions during which the raters (often known as "markers" or "judges") demonstrate their consistency before operational rating commences. Because many rating activities are high-stakes, the research community continues to explore new methods for analyzing rating data. We used simulated and empirical data from two high-stakes language assessments to propose a new approach, based on social network analysis and exponential random graph models, for evaluating the readiness of a group of raters for operational rating. The results of this approach are compared with the results of a Rasch analysis, which is a well-established approach for the analysis of such data. We also demonstrate how the new approach can be used in practice to investigate important research questions, such as whether rater severity is stable across rating tasks. The merits of the new approach and its consequences for practice are discussed.