Scale length does matter: Recommendations for measurement invariance testing with categorical factor analysis and item response theory approaches.

Affiliation

Department of Methodology and Statistics, School of Social and Behavioral Sciences, Tilburg University, PO Box 90153, 5000 LE, Tilburg, The Netherlands.

Publication information

Behav Res Methods. 2022 Oct;54(5):2114-2145. doi: 10.3758/s13428-021-01690-7. Epub 2021 Dec 15.

DOI: 10.3758/s13428-021-01690-7

PMID: 34910286

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9579096/

Abstract

In social sciences, the study of group differences concerning latent constructs is ubiquitous. These constructs are generally measured by means of scales composed of ordinal items. In order to compare these constructs across groups, one crucial requirement is that they are measured equivalently or, in technical jargon, that measurement invariance (MI) holds across the groups. This study compared the performance of scale- and item-level approaches based on multiple group categorical confirmatory factor analysis (MG-CCFA) and multiple group item response theory (MG-IRT) in testing MI with ordinal data. In general, the results of the simulation studies showed that MG-CCFA-based approaches outperformed MG-IRT-based approaches when testing MI at the scale level, whereas, at the item level, the best performing approach depends on the tested parameter (i.e., loadings or thresholds). That is, when testing loadings equivalence, the likelihood ratio test provided the best trade-off between true-positive rate and false-positive rate, whereas, when testing thresholds equivalence, the χ² test outperformed the other testing strategies. In addition, the performance of MG-CCFA's fit measures, such as RMSEA and CFI, seemed to depend largely on the length of the scale, especially when MI was tested at the item level. General caution is recommended when using these measures, especially when MI is tested for each item individually.
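To make the likelihood ratio test mentioned in the abstract concrete, the following is a minimal sketch of the generic nested-model comparison, not the authors' implementation. It assumes two maximum-likelihood fits are already available: a constrained model (for example, one item's loadings held equal across groups) and a freer model in which those parameters vary by group. The function names and all numeric inputs are hypothetical. The plain difference test shown here does not apply to robust categorical estimators such as WLSMV, which need a scaled difference test instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate.

    Uses the power series for the regularized lower incomplete gamma
    function, which is adequate for the moderate statistics seen in
    nested-model comparisons.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def likelihood_ratio_test(loglik_free, loglik_constrained,
                          n_params_free, n_params_constrained):
    """Compare a freer model against a nested, constrained model.

    Returns the test statistic, its degrees of freedom, and the p-value.
    A small p-value suggests the equality constraints worsen fit, i.e.
    the constrained parameters may not be invariant across groups.
    """
    stat = 2.0 * (loglik_free - loglik_constrained)
    df = n_params_free - n_params_constrained
    if df <= 0:
        raise ValueError("the constrained model must have fewer free parameters")
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts, for illustration only.
    stat, df, p = likelihood_ratio_test(-5120.4, -5126.9, 40, 36)
    print(f"LR statistic = {stat:.2f}, df = {df}, p = {p:.4f}")
```

When the test is repeated for every item, as in item-level MI testing, the number of tests grows with scale length, so some correction for multiple comparisons is usually worth considering.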
