Chen Wenya, Fujimoto Ken A
Loyola University Chicago, Chicago, IL, USA.
Appl Psychol Meas. 2022 Nov;46(8):675-689. doi: 10.1177/01466216221108133. Epub 2022 Jul 10.
The bifactor item response theory model has become increasingly popular for analyzing data from educational and psychological studies. Unfortunately, using this model in practice comes with challenges. One such challenge is an empirical identification issue that is seldom discussed in the literature, and its impact on the estimates of the bifactor model's parameters has not been demonstrated. This issue occurs when an item's discriminations on the general and specific dimensions are approximately equal (i.e., the within-item discriminations are similar in strength), which makes it difficult to obtain unique estimates for those discriminations. We conducted three simulation studies to demonstrate that within-item discriminations of similar strength create problems in estimation stability. The results suggest that a large sample could alleviate but not resolve the problems, at least for sample sizes up to 4,000. When the discriminations within items were made clearly different, their estimates were more consistent across the data replicates than when the within-item discriminations were similar. The results also show that the similarity of an item's discriminations across dimensions directly affects the sample size needed to obtain accurate parameter estimates consistently. Although our goal was to provide evidence of the empirical identification issue, the study further reveals that the extent of similarity of within-item discriminations, the magnitude of the discriminations, and how well the items are targeted to the respondents also affect the estimation of the bifactor model's parameters.
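To make the setup concrete, the following is a minimal sketch of how response data with similar versus clearly different within-item discriminations might be generated. It is not the authors' simulation design. The abstract does not state the item response function, so a two-parameter logistic bifactor form is assumed here, and the function name `simulate_bifactor_2pl`, the discrimination values (1.2, 1.8, 0.6), the item intercepts, and the item-to-dimension layout are all hypothetical.

```python
import math
import random


def simulate_bifactor_2pl(n_persons, a_general, a_specific, group, intercepts, seed=0):
    """Simulate binary responses under an assumed bifactor 2PL model.

    Assumed response function (not taken from the source):
        P(y_ij = 1) = logistic(a_general[j] * theta_g[i]
                               + a_specific[j] * theta_s[i][group[j]]
                               + intercepts[j])
    Each item loads on the general dimension and on exactly one
    specific dimension, given by group[j].
    """
    rng = random.Random(seed)
    n_groups = max(group) + 1
    data = []
    for _ in range(n_persons):
        # Independent standard-normal traits: one general, one per specific dimension.
        theta_g = rng.gauss(0.0, 1.0)
        theta_s = [rng.gauss(0.0, 1.0) for _ in range(n_groups)]
        row = []
        for j in range(len(group)):
            logit = (a_general[j] * theta_g
                     + a_specific[j] * theta_s[group[j]]
                     + intercepts[j])
            p = 1.0 / (1.0 + math.exp(-logit))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data


# Hypothetical layout: 12 items, 4 per specific dimension.
group = [0] * 4 + [1] * 4 + [2] * 4
n_items = len(group)
zeros = [0.0] * n_items

# Condition 1: within-item discriminations similar in strength.
similar = simulate_bifactor_2pl(500, [1.2] * n_items, [1.2] * n_items, group, zeros)

# Condition 2: within-item discriminations clearly different.
distinct = simulate_bifactor_2pl(500, [1.8] * n_items, [0.6] * n_items, group, zeros, seed=1)
```

Fitting a bifactor model to many replicates of each condition and comparing the spread of the discrimination estimates would be one way to examine the estimation stability that the abstract describes. The estimation step is omitted here because the source does not specify the software or estimator used.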