Lee HyeSun, Geisinger Kurt F
University of Nebraska-Lincoln, NE, USA.
Educ Psychol Meas. 2016 Feb;76(1):141-163. doi: 10.1177/0013164415585166. Epub 2015 May 18.
This study investigated the impact of matching criterion purification on the accuracy of differential item functioning (DIF) detection in large-scale assessments. Three matching approaches for DIF analysis (block-level matching, pooled booklet matching, and equated pooled booklet matching) were used with the Mantel-Haenszel procedure, each with and without purification. Five factors were manipulated: test length, the proportion of items exhibiting DIF, sample size, the ratio of the reference group to the focal group, and the presence of an average ability difference between the two groups. A systematic difference between test forms was also considered. Overall, matching criterion purification improved the power of DIF detection in all three approaches. The size of the improvement differed across the approaches, depending on the psychometric characteristics of the items exhibiting DIF and on whether an average ability difference was present. When no mean ability difference existed between the two groups, purification slightly reduced Type I error rates in all three approaches. Given the improvement in power with Type I error rates under control, matching criterion purification in the pooled booklet matching and equated pooled booklet matching approaches can be recommended for DIF analyses in large-scale assessments.
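To make the idea of matching criterion purification concrete, the following is a minimal sketch of a two-stage Mantel-Haenszel DIF screen. It is not the study's implementation: it assumes a single test form matched on the total score (not the block-level or pooled booklet approaches compared in the study), dichotomous items, and a fixed critical value of 3.84 (chi-square, 1 df, alpha = .05). The function names `mh_statistic`, `mh_dif`, and `simulate`, and all simulation settings, are hypothetical and chosen only for illustration.

```python
import numpy as np

def mh_statistic(item, group, match):
    """Mantel-Haenszel chi-square and common odds ratio for one item.
    item: 0/1 responses; group: 0 = reference, 1 = focal; match: matching score."""
    num = den = a_sum = e_sum = v_sum = 0.0
    for k in np.unique(match):
        ref = (match == k) & (group == 0)
        foc = (match == k) & (group == 1)
        n_r, n_f = ref.sum(), foc.sum()
        n = n_r + n_f
        if n_r == 0 or n_f == 0 or n < 2:
            continue                      # stratum carries no information
        a = item[ref].sum()               # reference, correct
        b = n_r - a                       # reference, incorrect
        c = item[foc].sum()               # focal, correct
        d = n_f - c                       # focal, incorrect
        m1, m0 = a + c, b + d
        num += a * d / n
        den += b * c / n
        a_sum += a
        e_sum += n_r * m1 / n             # E(a) under no DIF
        v_sum += n_r * n_f * m1 * m0 / (n * n * (n - 1))
    alpha = num / den if den > 0 else np.inf
    # Continuity-corrected statistic; the correction is floored at zero here.
    chi2 = max(abs(a_sum - e_sum) - 0.5, 0.0) ** 2 / v_sum if v_sum > 0 else 0.0
    return chi2, alpha

def mh_dif(responses, group, purify=True, crit=3.84):
    """Flag items with MH chi-square above crit, optionally after purification."""
    n_items = responses.shape[1]

    def screen(keep):
        flags = np.zeros(n_items, dtype=bool)
        for j in range(n_items):
            cols = keep.copy()
            cols[j] = True                # studied item stays in its own criterion
            match = responses[:, cols].sum(axis=1)
            chi2, _ = mh_statistic(responses[:, j], group, match)
            flags[j] = chi2 > crit
        return flags

    flags = screen(np.ones(n_items, dtype=bool))   # stage 1: all items match
    if purify:
        flags = screen(~flags)                     # stage 2: drop flagged items
    return flags

def simulate(n_per_group=2000, n_items=20, dif_shift=1.0, seed=0):
    """Hypothetical Rasch-type data in which item 0 is harder for the focal group."""
    rng = np.random.default_rng(seed)
    group = np.repeat([0, 1], n_per_group)
    theta = rng.normal(0.0, 1.0, size=2 * n_per_group)
    diff = np.tile(np.linspace(-1.5, 1.5, n_items), (2 * n_per_group, 1))
    diff[group == 1, 0] += dif_shift
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - diff)))
    return (rng.random(p.shape) < p).astype(int), group

if __name__ == "__main__":
    responses, group = simulate()
    print("flagged without purification:", np.flatnonzero(mh_dif(responses, group, purify=False)))
    print("flagged with purification:   ", np.flatnonzero(mh_dif(responses, group, purify=True)))
```

In this sketch, the second stage rebuilds the matching score from items that were not flagged in the first stage, while always keeping the studied item in its own criterion. A single purification pass is shown for brevity; iterating until the flagged set stops changing is a common variant.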