Wolfe E W, Moulder B C, Myford C M
Michigan State University, East Lansing 48840, USA.
J Appl Meas. 2001;2(3):256-80.
This paper describes a class of rater effects that depict rater-by-time interactions. We refer to this class of rater effects as DRIFT (differential rater functioning over time). This article describes several types of DRIFT (primacy/recency, differential centrality/extremism, and practice/fatigue) and Rasch measurement procedures designed to identify these types of DRIFT in rating data. These procedures are applied to simulated data and are shown to be useful in classifying raters as being aberrant or non-aberrant for primacy, recency, and differential centrality and extremism, particularly for moderate or larger effect sizes. Rates of correct classification for practice and fatigue were lower, and statistical power exceeded .50 only with very large effect sizes. Type I error rates (i.e., incorrect nomination) were near expected levels in all cases.