Model-Based Measures for Detecting and Quantifying Response Bias.

Author information

Department of Educational Psychology, The University of Georgia, 323 Aderhold Hall, Athens, GA 30602, USA.

Publication information

Psychometrika. 2018 Sep;83(3):696-732. doi: 10.1007/s11336-018-9626-9. Epub 2018 Jun 15.

Abstract

This paper proposes a model-based family of detection and quantification statistics to evaluate response bias in item bundles of any size. Compensatory (CDRF) and non-compensatory (NCDRF) response bias measures are proposed, along with their sample realizations and large-sample variability when models are fitted using multiple-group estimation. Based on the underlying connection to item response theory estimation methodology, it is argued that these new statistics provide a powerful and flexible approach to studying response bias for categorical response data over and above methods that have previously appeared in the literature. To evaluate their practical utility, CDRF and NCDRF are compared to the closely related SIBTEST family of statistics and likelihood-based detection methods through a series of Monte Carlo simulations. Results indicate that the new statistics are more optimal effect size estimates of marginal response bias than the SIBTEST family, are competitive with a selection of likelihood-based methods when studying item-level bias, and are the most optimal when studying differential bundle and test bias.
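The distinction between the compensatory and non-compensatory measures can be illustrated with a small numerical sketch. The abstract does not state the formulas, so the forms used here are assumptions: the compensatory measure is taken as the ability-weighted average of the signed difference between reference-group and focal-group expected bundle scores, and the non-compensatory measure as the average of the absolute difference. The 2PL item model, the standard normal ability density, the item parameters, and all function names are hypothetical choices made for illustration only.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability of a correct response at ability theta (assumed item model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def expected_score(theta, items):
    """Expected bundle score: sum of item probabilities at theta."""
    return sum(p_2pl(theta, a, b) for a, b in items)

def std_normal_pdf(theta):
    """Standard normal density, used here as the assumed ability weight."""
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)

def drf_measures(ref_items, foc_items, lo=-6.0, hi=6.0, n=2001):
    """Return (signed, unsigned) weighted score differences via the trapezoid rule."""
    step = (hi - lo) / (n - 1)
    signed = 0.0
    unsigned = 0.0
    for i in range(n):
        theta = lo + i * step
        # Trapezoid weight (half weight at the endpoints) times the ability density.
        w = step * (0.5 if i in (0, n - 1) else 1.0) * std_normal_pdf(theta)
        diff = expected_score(theta, ref_items) - expected_score(theta, foc_items)
        signed += w * diff          # differences of opposite sign can cancel
        unsigned += w * abs(diff)   # differences cannot cancel
    return signed, unsigned

# Hypothetical two-item bundle as (discrimination, difficulty) pairs per group.
reference = [(1.0, -0.5), (1.0, 0.5)]
focal = [(1.0, 0.0), (1.0, 0.0)]
signed, unsigned = drf_measures(reference, focal)
```

In this made-up bundle the score difference changes sign around theta = 0 and is symmetric about it, so the signed value is numerically close to zero while the unsigned value stays positive. That is the sense in which bias may cancel under a compensatory measure but not under a non-compensatory one. Sample estimates and their large-sample variability, which the paper derives under multiple-group estimation, are not covered by this sketch.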

Similar articles

Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF.

Psychometrika. 2018 Jun;83(2):376-386. doi: 10.1007/s11336-017-9583-8. Epub 2017 Aug 22.

It Might Not Make a Big DIF: Improved Differential Test Functioning Statistics That Account for Sampling Variability.

Educ Psychol Meas. 2016 Feb;76(1):114-140. doi: 10.1177/0013164415584576. Epub 2015 Jun 29.

Modeling and Testing Differential Item Functioning in Unidimensional Binary Item Response Models with a Single Continuous Covariate: A Functional Data Analysis Approach.

Psychometrika. 2016 Jun;81(2):371-98. doi: 10.1007/s11336-015-9473-x. Epub 2015 Jul 9.

A Monte Carlo study of the impact of missing data and differential item functioning on theta estimates from two polytomous Rasch family models.

J Appl Meas. 2007;8(4):388-403.

The Power of Crossing SIBTEST.

Appl Psychol Meas. 2020 Jul;44(5):393-408. doi: 10.1177/0146621620909907. Epub 2020 Mar 14.

Testing Differential Item Functioning in Small Samples.

Multivariate Behav Res. 2020 Sep-Oct;55(5):722-747. doi: 10.1080/00273171.2019.1671162. Epub 2019 Oct 4.

An extension of a bayesian approach to detect differential item functioning.

J Appl Meas. 2013;14(2):149-58.

Robust maximum marginal likelihood (RMML) estimation for item response theory models.

Behav Res Methods. 2019 Apr;51(2):573-588. doi: 10.3758/s13428-018-1150-4.

Score-Based Tests of Differential Item Functioning via Pairwise Maximum Likelihood Estimation.

Psychometrika. 2018 Mar;83(1):132-155. doi: 10.1007/s11336-017-9591-8. Epub 2017 Nov 17.

Cited by

Including Empirical Prior Information in the Reliable Change Index.

Appl Psychol Meas. 2025 Jul 10:01466216251358492. doi: 10.1177/01466216251358492.

Initial development and validation of item banks to measure problematic hypersexuality.

Open Res Eur. 2024 Nov 22;3:129. doi: 10.12688/openreseurope.16131.2. eCollection 2023.

Remote Testing of Reading Comprehension in 8-Year-Old Children: Mode and Setting Effects.

Assessment. 2024 Mar;31(2):248-262. doi: 10.1177/10731911231159369. Epub 2023 Mar 8.

Item response theory and differential test functioning analysis of the HBSC-Symptom-Checklist across 46 countries.

BMC Med Res Methodol. 2022 Sep 29;22(1):253. doi: 10.1186/s12874-022-01698-3.

An R toolbox for score-based measurement invariance tests in IRT models.

Behav Res Methods. 2022 Oct;54(5):2101-2113. doi: 10.3758/s13428-021-01689-0. Epub 2021 Dec 16.

Differential Item Functioning Analyses of the Patient-Reported Outcomes Measurement Information System (PROMIS®) Measures: Methods, Challenges, Advances, and Future Directions.

Psychometrika. 2021 Sep;86(3):674-711. doi: 10.1007/s11336-021-09775-0. Epub 2021 Jul 12.

Examining the measurement equivalence of the Maslach Burnout Inventory across age, gender, and specialty groups in US physicians.

J Patient Rep Outcomes. 2021 Jun 5;5(1):43. doi: 10.1186/s41687-021-00312-2.

More flexible response functions for the PROMIS physical functioning item bank by application of a monotonic polynomial approach.

Qual Life Res. 2022 Jan;31(1):37-47. doi: 10.1007/s11136-021-02873-7. Epub 2021 May 27.

References

It Might Not Make a Big DIF: Improved Differential Test Functioning Statistics That Account for Sampling Variability.

Educ Psychol Meas. 2016 Feb;76(1):114-140. doi: 10.1177/0013164415584576. Epub 2015 Jun 29.

Numerical approximation of the observed information matrix with Oakes' identity.

Br J Math Stat Psychol. 2018 Nov;71(3):415-436. doi: 10.1111/bmsp.12127. Epub 2018 Jan 9.

Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF.

Psychometrika. 2018 Jun;83(2):376-386. doi: 10.1007/s11336-017-9583-8. Epub 2017 Aug 22.

Profile-likelihood Confidence Intervals in Item Response Theory Models.

Multivariate Behav Res. 2017 Sep-Oct;52(5):533-550. doi: 10.1080/00273171.2017.1329082. Epub 2017 Jun 8.

A basis for analyzing test-retest reliability.

Psychometrika. 1945;10:255-82. doi: 10.1007/BF02288892.

A taxonomy of effect size measures for the differential functioning of items and scales.

J Appl Psychol. 2010 Jul;95(4):728-43. doi: 10.1037/a0018966.

Examining the effects of differential item (functioning and differential) test functioning on selection decisions: when are statistically significant effects practically important?

J Appl Psychol. 2004 Jun;89(3):497-508. doi: 10.1037/0021-9010.89.3.497.
