Sample Size Requirements for Applying Mixed Polytomous Item Response Models: Results of a Monte Carlo Simulation Study.

Authors

Kutscher Tanja, Eid Michael, Crayen Claudia

Affiliations

Department of Education and Psychology, Freie Universitaet Berlin, Berlin, Germany.

Department of Data Center and Method Development, Leibniz Institute for Educational Trajectories, Bamberg, Germany.

Publication information

Front Psychol. 2019 Nov 13;10:2494. doi: 10.3389/fpsyg.2019.02494. eCollection 2019.

DOI:10.3389/fpsyg.2019.02494
PMID:31798490
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6863808/
Abstract

Mixture models of item response theory (IRT) can be used to detect inappropriate category use. Data collected by panel surveys, where attitudes and traits are typically assessed by short scales with many response categories, are prone to response styles that indicate inappropriate category use. However, applying mixed IRT models to this data type can be challenging because of the many threshold parameters within items. To date, little is known about the sample size that estimation methods and goodness-of-fit criteria of mixed IRT models need in order to perform adequately in this case. The present Monte Carlo simulation study examined these issues for two mixed IRT models: the restricted mixed generalized partial credit model (rmGPCM) and the mixed partial credit model (mPCM). The population parameters of the simulation study were taken from a real application to challenging survey data (a 5-item scale with an 11-point rating scale, and three latent classes). Additional data conditions (e.g., long tests, a reduced number of response categories, and a simple latent mixture) were included to improve the generalizability of the results. Under the challenging data condition, data were generated for each model with sample sizes ranging from 500 to 5,000 observations in steps of 500. For the additional conditions, only three sample sizes (1,000, 2,500, and 4,500 observations) were examined. The effect of sample size on estimation problems and on the accuracy of parameter and standard error estimates was evaluated. Results show that the two mixed IRT models require at least 2,500 observations to provide accurate parameter and standard error estimates under the challenging data condition. The rmGPCM produces more estimation problems than the more parsimonious mPCM, mostly because of the sparse tables that arise from the many response categories. The two models exhibit similar trends of estimation accuracy across sample sizes. Under the additional conditions, no estimation problems are observed. Both models perform well with a smaller sample size when long tests are used or when the true latent mixture includes two classes. For model selection, the AIC3 and the SABIC are the most reliable information criteria.
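Two points in the abstract lend themselves to a short numerical illustration: why an 11-point, 5-item scale yields sparse tables, and how the information criteria named for model selection are computed. The Python sketch below uses the standard definitions AIC3 = -2 log L + 3k and SABIC = -2 log L + k ln((n + 2) / 24). The function name `information_criteria` and the log-likelihood and parameter count in the usage lines are hypothetical and are not taken from the study. The threshold count assumes that every latent class has its own set of thresholds for every item.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return common information criteria for a fitted model.

    log_likelihood: maximized log-likelihood of the model
    n_params:       number of freely estimated parameters (k)
    n_obs:          sample size (n)
    """
    deviance = -2.0 * log_likelihood
    return {
        "AIC": deviance + 2 * n_params,
        "AIC3": deviance + 3 * n_params,
        "BIC": deviance + n_params * math.log(n_obs),
        # Sample-size-adjusted BIC replaces n with (n + 2) / 24.
        "SABIC": deviance + n_params * math.log((n_obs + 2) / 24),
    }


# Design of the challenging condition described in the abstract.
n_items, n_categories, n_classes = 5, 11, 3

# Assumption: class-specific thresholds, (categories - 1) per item.
n_thresholds = n_items * (n_categories - 1) * n_classes

# Number of distinct response patterns in the full cross-table.
n_patterns = n_categories ** n_items

# Sample sizes from 500 to 5,000 in steps of 500.
sample_sizes = list(range(500, 5001, 500))

for n in sample_sizes:
    # Even the largest sample covers only a small share of all patterns.
    print(n, f"max share of patterns observed: {n / n_patterns:.3%}")

# Hypothetical fit results, for illustration only.
print(information_criteria(log_likelihood=-52000.0, n_params=152, n_obs=2500))
```

Because the number of possible response patterns far exceeds every sample size in the design, most cells of the cross-table are necessarily empty, which is the sparseness the abstract links to estimation problems. When comparing candidate models, the one with the lowest value of a given criterion is preferred.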

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c17f/6863808/20c68b26a96d/fpsyg-10-02494-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c17f/6863808/5279c744ab16/fpsyg-10-02494-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c17f/6863808/2349f7a9c3f1/fpsyg-10-02494-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c17f/6863808/222876564c73/fpsyg-10-02494-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c17f/6863808/9e2b80f600f5/fpsyg-10-02494-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c17f/6863808/34ac1ccd2dbd/fpsyg-10-02494-g0006.jpg

Similar articles

1
Sample Size Requirements for Applying Mixed Polytomous Item Response Models: Results of a Monte Carlo Simulation Study.
Front Psychol. 2019 Nov 13;10:2494. doi: 10.3389/fpsyg.2019.02494. eCollection 2019.
2
The Impact of Test and Sample Characteristics on Model Selection and Classification Accuracy in the Multilevel Mixture IRT Model.
Front Psychol. 2020 Feb 14;11:197. doi: 10.3389/fpsyg.2020.00197. eCollection 2020.
3
The Impact of Sample Size and Various Other Factors on Estimation of Dichotomous Mixture IRT Models.
Educ Psychol Meas. 2023 Jun;83(3):520-555. doi: 10.1177/00131644221094325. Epub 2022 May 19.
4
Mixture Random-Effect IRT Models for Controlling Extreme Response Style on Rating Scales.
Front Psychol. 2016 Nov 2;7:1706. doi: 10.3389/fpsyg.2016.01706. eCollection 2016.
5
Using a Mixed IRT Model to Assess the Scale Usage in the Measurement of Job Satisfaction.
Front Psychol. 2017 Jan 4;7:1998. doi: 10.3389/fpsyg.2016.01998. eCollection 2016.
6
An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models.
Educ Psychol Meas. 2024 Jun;84(3):481-509. doi: 10.1177/00131644231180529. Epub 2023 Jun 26.
7
Mixture IRT Model With a Higher-Order Structure for Latent Traits.
Educ Psychol Meas. 2017 Apr;77(2):275-304. doi: 10.1177/0013164416640327. Epub 2016 Apr 1.
8
Assessing fit of alternative unidimensional polytomous IRT models using posterior predictive model checking.
Psychol Methods. 2017 Jun;22(2):397-408. doi: 10.1037/met0000082. Epub 2016 May 30.
9
General mixture item response models with different item response structures: Exposition with an application to Likert scales.
Behav Res Methods. 2018 Dec;50(6):2325-2344. doi: 10.3758/s13428-017-0997-0.
10
Online Calibration of Polytomous Items Under the Generalized Partial Credit Model.
Appl Psychol Meas. 2016 Sep;40(6):434-450. doi: 10.1177/0146621616650406. Epub 2016 Jul 28.

Cited by

1
Cultural adaptation and validation of the desire to avoid pregnancy scale in Brazil.
PLoS One. 2025 Jul 28;20(7):e0327553. doi: 10.1371/journal.pone.0327553. eCollection 2025.
2
Psychometric Evaluation of the Chinese Version of the Mental Health System Responsiveness Questionnaire for Psychiatric Outpatients: Classical Test Theory and Item Response Theory Approaches.
Patient Prefer Adherence. 2025 Mar 24;19:729-740. doi: 10.2147/PPA.S503016. eCollection 2025.
3
An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models.
Educ Psychol Meas. 2024 Jun;84(3):481-509. doi: 10.1177/00131644231180529. Epub 2023 Jun 26.
4
Psychometric benefits of self-chosen rating scales over given rating scales.
Behav Res Methods. 2024 Oct;56(7):7440-7464. doi: 10.3758/s13428-024-02429-w. Epub 2024 May 6.
5
Establishing language and ethnic equivalence for health-related quality of life item banks and testing their efficiency via computerised adaptive testing simulations.
PLoS One. 2024 Feb 23;19(2):e0298141. doi: 10.1371/journal.pone.0298141. eCollection 2024.
6
The Impact of Sample Size and Various Other Factors on Estimation of Dichotomous Mixture IRT Models.
Educ Psychol Meas. 2023 Jun;83(3):520-555. doi: 10.1177/00131644221094325. Epub 2022 May 19.
7
Evaluation of the Desire to Avoid Pregnancy Scale in the UK: a psychometric analysis including predictive validity.
BMJ Open. 2022 Jul 25;12(7):e060287. doi: 10.1136/bmjopen-2021-060287.
8
Dimensionality and psychometric analysis of DLQI in a Brazilian population.
Health Qual Life Outcomes. 2020 Aug 5;18(1):268. doi: 10.1186/s12955-020-01523-9.

References

1
A Simulation Study on Methods of Correcting for the Effects of Extreme Response Style.
Educ Psychol Meas. 2016 Apr;76(2):304-324. doi: 10.1177/0013164415591848. Epub 2015 Jun 29.
2
Rasch Mixture Models for DIF Detection: A Comparison of Old and New Score Specifications.
Educ Psychol Meas. 2015 Apr;75(2):208-234. doi: 10.1177/0013164414536183. Epub 2014 Jun 22.
3
Using a Mixed IRT Model to Assess the Scale Usage in the Measurement of Job Satisfaction.
Front Psychol. 2017 Jan 4;7:1998. doi: 10.3389/fpsyg.2016.01998. eCollection 2016.
4
Mixture Random-Effect IRT Models for Controlling Extreme Response Style on Rating Scales.
Front Psychol. 2016 Nov 2;7:1706. doi: 10.3389/fpsyg.2016.01706. eCollection 2016.
5
Simultaneous Decision on the Number of Latent Clusters and Classes for Multilevel Latent Class Models.
Multivariate Behav Res. 2014 May-Jun;49(3):232-44. doi: 10.1080/00273171.2014.900431.
6
Reversed thresholds in partial credit models: a reason for collapsing categories?
Assessment. 2014 Dec;21(6):765-74. doi: 10.1177/1073191114530775. Epub 2014 Apr 30.
7
Rasch scalability of the somatosensory amplification scale: a mixture distribution approach.
J Psychosom Res. 2013 Jun;74(6):469-78. doi: 10.1016/j.jpsychores.2013.02.006. Epub 2013 Mar 8.
8
A Study of Rasch, partial credit, and rating scale model parameter recovery in WINSTEPS and jMetrik.
J Appl Meas. 2012;13(3):248-58.
9
Using the Mixed Rasch Model to analyze data from the beliefs and attitudes about memory survey.
J Appl Meas. 2012;13(1):23-40.
10
A Mixture IRT Analysis of Risky Youth Behavior.
Front Psychol. 2011 May 13;2:98. doi: 10.3389/fpsyg.2011.00098. eCollection 2011.