Jin Kuan-Yu, Chen Hui-Fang, Wang Wen-Chung
The University of Hong Kong, Pokfulam, Hong Kong.
City University of Hong Kong, Kowloon, Hong Kong.
Appl Psychol Meas. 2018 Nov;42(8):613-629. doi: 10.1177/0146621618762738. Epub 2018 Mar 21.
Differential item functioning (DIF) makes test scores incomparable and substantially threatens test validity. Although conventional approaches, such as the logistic regression (LR) and Mantel-Haenszel (MH) methods, have worked well, they are vulnerable to high percentages of DIF items in a test and to missing data. This study developed a simple but effective method to detect DIF using the odds ratio (OR) of two groups' responses to a studied item. The OR method uses all available information from examinees' responses, and it can eliminate the potential influence of bias in the total scores. Through a series of simulation studies in which the DIF pattern, impact, sample size (equal/unequal), purification procedure (with/without), percentage of DIF items, and proportion of missing data were manipulated, the performance of the OR method was evaluated and compared with that of the LR and MH methods. The results showed that the OR method without a purification procedure outperformed the LR and MH methods in controlling false positive rates and yielding high true positive rates when tests had a high percentage of DIF items favoring the same group. In addition, only the OR method was feasible when tests adopted an item matrix sampling design. The effectiveness of the OR method was illustrated with an empirical example.
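The core quantity in the approach described above is the item-level odds ratio of correct responses between the reference and focal groups, computed only from examinees who actually answered the item. The sketch below illustrates that computation under stated assumptions: dichotomous (0/1) responses with NaN marking unadministered items (as under matrix sampling), a 0.5 continuity correction, and a simple outlier-style flagging rule. The function names `log_odds_ratio` and `flag_dif` and the flagging cutoff are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

def log_odds_ratio(resp, group):
    """Log odds ratio of a correct response (reference vs. focal group)
    for one item, with a 0.5 continuity correction; NaN responses
    (e.g., items not administered under matrix sampling) are skipped."""
    mask = ~np.isnan(resp)
    r, g = resp[mask], group[mask]
    a = np.sum((g == 0) & (r == 1)) + 0.5  # reference group, correct
    b = np.sum((g == 0) & (r == 0)) + 0.5  # reference group, incorrect
    c = np.sum((g == 1) & (r == 1)) + 0.5  # focal group, correct
    d = np.sum((g == 1) & (r == 0)) + 0.5  # focal group, incorrect
    return np.log((a * d) / (b * c))

def flag_dif(data, group, z_crit=2.58):
    """Flag items whose log odds ratio is an outlier relative to the
    other items in the test (an illustrative criterion, not the
    article's exact rule).  `data` is an examinees-by-items array of
    0/1/NaN; `group` codes 0 = reference, 1 = focal."""
    lors = np.array([log_odds_ratio(data[:, j], group)
                     for j in range(data.shape[1])])
    z = (lors - np.median(lors)) / (np.std(lors, ddof=1) + 1e-12)
    return np.where(np.abs(z) > z_crit)[0], lors
```

Because each item's odds ratio is computed directly from the responses to that item, the procedure does not require a matching total score, which is why it remains applicable when large portions of the response matrix are missing by design.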