Rios, Joseph A.
University of Minnesota, Minneapolis, MN, USA.
Educ Psychol Meas. 2021 Oct;81(5):957-979. doi: 10.1177/0013164421990429. Epub 2021 Feb 12.
Low test-taking effort is a common validity threat when examinees perceive an assessment context to have minimal personal value. Prior research has shown that in such contexts, subgroups may differ in their effort, which raises two concerns when making subgroup mean comparisons. First, it is unclear how differential effort could influence evaluations of scale property equivalence. Second, even if full scalar invariance is attained, the degree to which differential effort can bias subgroup mean comparisons is unknown. To address these issues, a simulation study was conducted to examine the influence of differential noneffortful responding (NER) on evaluations of measurement invariance and latent mean comparisons. Results showed that as differential rates of NER grew, increased Type I errors of measurement invariance were observed only at the metric invariance level, while no negative effects were apparent for configural or scalar invariance. When full scalar invariance was correctly attained, differential NER biased mean score comparisons by as much as 0.18 standard deviations at a differential NER rate of 7%. These findings suggest that test users should evaluate and document potential differential NER prior to both conducting measurement quality analyses and reporting disaggregated subgroup mean performance.
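To make the mechanism concrete, the following is a minimal Python sketch of how differential NER can shift a subgroup mean comparison even when the two groups have identical ability distributions. It is not the study's simulation design: it compares observed sum scores rather than latent means from a multigroup factor model, and every setting in it is an assumption chosen for illustration (a Rasch-type response model, 20 items, the hypothetical item difficulties, NER applied to an examinee's whole test, and a guessing probability of 0.25).

```python
import math
import random
import statistics

N_ITEMS = 20
# Hypothetical item difficulties, evenly spaced from -1 to 1 (an assumption).
DIFFICULTIES = [-1.0 + 2.0 * j / (N_ITEMS - 1) for j in range(N_ITEMS)]
GUESS_PROB = 0.25  # assumed chance of a correct answer under noneffortful responding


def simulate_sum_scores(n_examinees, ner_rate, rng):
    """Simulate sum scores for one group.

    Ability is drawn from N(0, 1). With probability `ner_rate` an examinee is
    noneffortful and answers every item by guessing (an assumption: NER is
    modeled here at the examinee level, not the item-response level).
    """
    scores = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)
        noneffortful = rng.random() < ner_rate
        score = 0
        for b in DIFFICULTIES:
            if noneffortful:
                p_correct = GUESS_PROB
            else:
                # Rasch-type probability of a correct response.
                p_correct = 1.0 / (1.0 + math.exp(-(theta - b)))
            score += 1 if rng.random() < p_correct else 0
        scores.append(score)
    return scores


def standardized_mean_difference(focal, reference):
    """Mean difference (focal minus reference) in pooled standard deviation units."""
    pooled_sd = math.sqrt(
        (statistics.variance(focal) + statistics.variance(reference)) / 2.0
    )
    return (statistics.mean(focal) - statistics.mean(reference)) / pooled_sd


rng = random.Random(2021)  # fixed seed so the run is reproducible
reference = simulate_sum_scores(20000, 0.0, rng)
focal_no_ner = simulate_sum_scores(20000, 0.0, rng)
focal_ner = simulate_sum_scores(20000, 0.07, rng)  # 7% differential NER

baseline = standardized_mean_difference(focal_no_ner, reference)
biased = standardized_mean_difference(focal_ner, reference)
print(f"Difference with no NER in either group: {baseline:.3f} SD")
print(f"Difference with 7% NER in the focal group: {biased:.3f} SD")
```

Because both groups share the same ability distribution, any nonzero difference in the second comparison is attributable to NER alone. The size of the shift depends heavily on the assumed test difficulty and guessing probability, so the output should not be expected to reproduce the 0.18 standard deviation figure reported above.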