Polytomous Testlet Response Models for Technology-Enhanced Innovative Items: Implications on Model Fit and Trait Inference.

Authors

Kang Hyeon-Ah, Han Suhwa, Kim Doyoung, Kao Shu-Chuan

Affiliations

University of Texas at Austin, Austin, TX, USA.

National Council of State Boards of Nursing, Chicago, IL, USA.

Publication information

Educ Psychol Meas. 2022 Aug;82(4):811-838. doi: 10.1177/00131644211032261. Epub 2021 Aug 2.

DOI: 10.1177/00131644211032261
PMID: 35754615
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9228694/

Abstract

The development of technology-enhanced innovative items calls for practical models that can describe polytomous testlet items. In this study, we evaluate four measurement models that can characterize polytomous items administered in testlets: (a) generalized partial credit model (GPCM), (b) testlet-as-a-polytomous-item model (TPIM), (c) random-effect testlet model (RTM), and (d) fixed-effect testlet model (FTM). Using data from GPCM, FTM, and RTM, we examine the performance of the scoring models in multiple aspects: relative model fit, absolute item fit, significance of testlet effects, parameter recovery, and classification accuracy. The empirical analysis suggests that the relative performance of the models varies substantially depending on the testlet-effect type, effect size, and trait estimator. When testlets had no effects or fixed effects, GPCM and FTM led to the most desirable measurement outcomes. When testlets had random interaction effects, RTM demonstrated the best model fit, yet its trait recovery differed substantially depending on the estimator. In particular, the advantage of RTM as a scoring model was discernible only when strong random effects were present and the trait levels were estimated with Bayes priors. In other settings, the simpler models (i.e., GPCM, FTM) performed better or comparably. The study also revealed that polytomous scoring of testlet items has limited prospects as a functional scoring method. Based on the outcomes of the empirical evaluation, we provide practical guidelines for choosing a measurement model for polytomous innovative items that are administered in testlets.
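As a rough illustration of the models named in the abstract, the sketch below computes category probabilities for a single polytomous item under the GPCM, with an optional person-specific testlet effect. The abstract does not give the paper's parameterization, so everything here is an assumption: the function names `gpcm_probs` and `expected_score` are hypothetical, and subtracting the testlet effect `gamma` from the trait is one common textbook convention for random-effect testlet models, not necessarily the RTM used by the authors.

```python
import math

def gpcm_probs(theta, a, b, gamma=0.0):
    """Category probabilities for one item with len(b) + 1 score categories.

    theta : latent trait of the examinee
    a     : item discrimination
    b     : step parameters b_1..b_m
    gamma : person-specific testlet effect (assumed convention: it is
            subtracted from theta; gamma = 0 reduces to the plain GPCM)
    """
    # Cumulative logits: category 0 is the reference with logit 0.
    z = [0.0]
    for step in b:
        z.append(z[-1] + a * (theta - gamma - step))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(z)
    weights = [math.exp(v - top) for v in z]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(probs):
    """Expected item score given category probabilities."""
    return sum(k * p for k, p in enumerate(probs))

if __name__ == "__main__":
    plain = gpcm_probs(theta=0.5, a=1.2, b=[-0.5, 0.4])
    shifted = gpcm_probs(theta=0.5, a=1.2, b=[-0.5, 0.4], gamma=0.8)
    print(expected_score(plain), expected_score(shifted))
```

Under this assumed convention, a positive `gamma` acts like a lower effective trait level on every item of the same testlet, which is how a shared testlet effect induces local dependence among those items.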
