Item Response Theory With Covariates (IRT-C): Assessing Item Recovery and Differential Item Functioning for the Three-Parameter Logistic Model.

Author information

Tay Louis, Huang Qiming, Vermunt Jeroen K

Affiliations

Purdue University, West Lafayette, IN, USA.

Tilburg University, Tilburg, Netherlands.

Publication information

Educ Psychol Meas. 2016 Feb;76(1):22-42. doi: 10.1177/0013164415579488. Epub 2015 Apr 6.

Abstract

In large-scale testing, multigroup approaches are of limited use for assessing differential item functioning (DIF) across multiple variables, because DIF is examined for each variable separately. In contrast, the item response theory with covariates (IRT-C) procedure can examine DIF across multiple variables (covariates) simultaneously. To assess the utility of the IRT-C procedure, we conducted a simulation study. Using SAT data to obtain realistic parameters, we simulated uniform DIF on three covariates: gender (dichotomous), race/ethnicity (categorical), and income (continuous). Simulations were conducted across several conditions: two test lengths (14 and 21 items), four sample sizes (5,000, 10,000, 20,000, and 40,000), and two DIF effect sizes (medium and large). The IRT-C procedure accurately recovered the latent means and the three-parameter logistic model parameters with a substantial sample size of 20,000. Type I error rates were well controlled at the nominal rates across the sample sizes. Good power (>.80) to detect DIF across all covariates was observed at a sample size of 20,000 for the large DIF effect size and 40,000 for the medium DIF effect size. Practical implications for the use of the IRT-C procedure are discussed.
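The data-generating side of such a simulation can be sketched as follows. This is only an illustration of a three-parameter logistic model with uniform DIF on a dichotomous, a categorical, and a continuous covariate; it is not the authors' code. The function name `simulate_3pl_uniform_dif`, the dummy coding of race/ethnicity into three groups, the standardized income variable, the sign convention (a positive shift makes an item harder), and every numeric value are assumptions chosen for the example, since the abstract does not report the generating parameters. Estimation with the IRT-C procedure itself is not shown.

```python
import numpy as np


def simulate_3pl_uniform_dif(n_persons, a, b, c, dif_shift, impact, seed=0):
    """Simulate 0/1 responses from a 3PL model with covariate-driven uniform DIF.

    a, b, c   : arrays of length n_items (discrimination, difficulty, guessing)
    dif_shift : (4, n_items) array; difficulty shift per unit of each covariate column
    impact    : length-4 array; effect of each covariate column on the latent mean
    Covariate columns (hypothetical coding): gender, race group 1, race group 2, income.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical covariates: binary gender, three race/ethnicity groups
    # (dummy coded against group 0), and standardized continuous income.
    gender = rng.integers(0, 2, size=n_persons)
    race = rng.integers(0, 3, size=n_persons)
    income = rng.standard_normal(n_persons)
    X = np.column_stack([gender, race == 1, race == 2, income]).astype(float)

    # Latent trait: covariates shift the latent mean, residual is standard normal.
    theta = X @ impact + rng.standard_normal(n_persons)

    # Uniform DIF: covariates shift item difficulty only, not discrimination.
    b_eff = b[None, :] + X @ dif_shift  # shape (n_persons, n_items)

    # 3PL response probability: c + (1 - c) * logistic(a * (theta - b)).
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta[:, None] - b_eff)))
    y = (rng.random(p.shape) < p).astype(int)
    return X, theta, y


# Illustrative setup: 14 items, 5,000 persons, no latent mean differences,
# and gender DIF on the first item only (all values are made up).
n_items = 14
a = np.ones(n_items)
b = np.zeros(n_items)
c = np.full(n_items, 0.2)
dif_shift = np.zeros((4, n_items))
dif_shift[0, 0] = 1.0  # item 0 is harder when gender == 1
impact = np.zeros(4)

X, theta, y = simulate_3pl_uniform_dif(5000, a, b, c, dif_shift, impact, seed=1)
```

Because the DIF term enters only through the difficulty, the item characteristic curves for the two gender groups on the first item are parallel shifts of one another, which is what "uniform" DIF refers to. A full replication of a simulation study would repeat this generation step many times per condition and fit the model to each data set.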




