Joo Seang-Hwane, Lee Philseok, Stark Stephen
The University of Kansas, Lawrence, KS, USA.
George Mason University, Fairfax, VA, USA.
Appl Psychol Meas. 2022 Mar;46(2):98-115. doi: 10.1177/01466216211066606. Epub 2022 Feb 10.
Differential item functioning (DIF) analysis is one of the most important applications of item response theory (IRT) in psychological assessment. This study examined the performance of two Bayesian DIF methods, the Bayes factor (BF) and the deviance information criterion (DIC), with the generalized graded unfolding model (GGUM). Type I error rates and power were investigated in a Monte Carlo simulation that manipulated sample size, DIF source, DIF size, DIF location, subpopulation trait distribution, and type of baseline model. We also examined the performance of two likelihood-based methods, the likelihood ratio (LR) test and the Akaike information criterion (AIC), using marginal maximum likelihood (MML) estimation for comparison with past DIF research. The results indicated that the proposed BF and DIC methods provided well-controlled Type I error rates and high power under a free-baseline model implementation; moreover, their performance was superior to that of LR and AIC in terms of Type I error rates when the reference and focal group trait distributions differed. Implications and recommendations for applied research are discussed.
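The free-baseline model-comparison logic behind the LR test described in the abstract can be sketched in miniature. The sketch below is an illustration only, not the authors' implementation: it substitutes a one-parameter logistic (Rasch) item for the GGUM, treats trait levels as known simulated values, and estimates item difficulty by a simple grid search rather than MML. All function names and the simulated parameter values are hypothetical. A constrained model forces one shared difficulty across the reference and focal groups; the free model allows group-specific difficulties; twice the log-likelihood gain of the free model is compared against the chi-square(1) critical value.

```python
import math
import random


def loglik(responses, thetas, b):
    """Log-likelihood of dichotomous responses under a Rasch item with difficulty b."""
    ll = 0.0
    for x, th in zip(responses, thetas):
        p = 1.0 / (1.0 + math.exp(-(th - b)))
        ll += math.log(p) if x == 1 else math.log(1.0 - p)
    return ll


def mle_b(responses, thetas):
    """Grid-search MLE of item difficulty; returns (max log-likelihood, b-hat)."""
    return max((loglik(responses, thetas, b / 100), b / 100)
               for b in range(-300, 301))


def simulate(n, b, rng):
    """Simulate n Rasch responses to an item of difficulty b (hypothetical values)."""
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n)]
    resp = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(t - b))) else 0
            for t in thetas]
    return resp, thetas


rng = random.Random(1)
# Reference group answers a b = 0.0 item; focal group sees b = 1.0 (a DIF item).
ref_x, ref_t = simulate(500, 0.0, rng)
foc_x, foc_t = simulate(500, 1.0, rng)

# Constrained model: one shared difficulty for both groups.
ll_c, b_shared = mle_b(ref_x + foc_x, ref_t + foc_t)
# Free model: group-specific difficulties.
ll_r, b_ref = mle_b(ref_x, ref_t)
ll_f, b_foc = mle_b(foc_x, foc_t)

# LR statistic: twice the improvement of the free over the constrained model.
lr = 2.0 * ((ll_r + ll_f) - ll_c)
print(f"LR = {lr:.2f}, DIF flagged: {lr > 3.84}")  # 3.84 = chi-square(1), alpha = .05
```

With a full group difference of 1.0 in difficulty and 500 examinees per group, the free model fits markedly better and the item is flagged. The same comparison underlies the Bayesian variants in the study, with BF or DIC replacing the chi-square decision rule.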