Unidimensional IRT Item Parameter Estimates Across Equivalent Test Forms With Confounding Specifications Within Dimensions.

Author information

Matlock Ki Lynn, Turner Ronna

Affiliations

Oklahoma State University, Stillwater, OK, USA.

University of Arkansas, Fayetteville, AR, USA.

Publication information

Educ Psychol Meas. 2016 Apr;76(2):258-279. doi: 10.1177/0013164415589756. Epub 2015 Jun 9.

Abstract

When constructing multiple test forms, the number of items and the total test difficulty are often equivalent. Not all test developers match the number of items and/or average item difficulty within subcontent areas. In this simulation study, six test forms were constructed having an equal number of items and average item difficulty overall. Manipulated variables were the number of items and average item difficulty within subsets of items primarily measuring one of two dimensions. Data sets were simulated at four levels of correlation (0, .3, .6, and .9). Item parameters were estimated using the Rasch and two-parameter logistic unidimensional item response theory models. Estimated discrimination and difficulty were compared across forms and to the true item parameters. The average unidimensional estimated discrimination was consistent across forms having the same correlation. Forms having a larger set of easy items measuring one dimension were estimated as being more difficult than forms having a larger set of hard items. Estimates were also investigated within subsets of items, and measures of bias were reported. This study encourages test developers to not only maintain consistent test specifications across forms as a whole but also within subcontent areas.
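The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not give the generating model, sample sizes, or item parameters, so everything below is an assumption. In particular, `simulate_form` is a hypothetical helper that lets each item load on exactly one of two correlated ability dimensions (the study's items only "primarily" measure one dimension), and the item counts, difficulties, and examinee count are made up for illustration.

```python
import numpy as np

def simulate_form(n_examinees, a, b, dim, rho, seed=0):
    """Simulate 0/1 responses for one form (illustrative assumption, not the study's model).

    a, b : per-item discrimination and difficulty arrays
    dim  : per-item index (0 or 1) of the dimension the item measures
    rho  : correlation between the two ability dimensions
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # Draw two correlated abilities per examinee, shape (n_examinees, 2)
    theta = rng.multivariate_normal(np.zeros(2), cov, size=n_examinees)
    # 2PL response probability using the ability on each item's own dimension
    logits = a * (theta[:, dim] - b)
    p = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(p.shape) < p).astype(int)

# Hypothetical form: 12 easier items on dimension 0, 8 harder items on dimension 1
a = np.ones(20)
b = np.concatenate([np.linspace(-2.0, 0.0, 12), np.linspace(0.0, 2.0, 8)])
dim = np.array([0] * 12 + [1] * 8)

# The four correlation levels named in the abstract
for rho in (0.0, 0.3, 0.6, 0.9):
    responses = simulate_form(2000, a, b, dim, rho)
    print(rho, responses.mean(axis=0).round(2))
```

Fitting a unidimensional Rasch or two-parameter logistic model to each simulated response matrix, then comparing the estimates with `a` and `b`, would be the next step; that estimation is left out here because it depends on the IRT software used.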

Similar articles

1. Unidimensional IRT Item Parameter Estimates Across Equivalent Test Forms With Confounding Specifications Within Dimensions.
   Educ Psychol Meas. 2016 Apr;76(2):258-279. doi: 10.1177/0013164415589756. Epub 2015 Jun 9.
2. Comparing Traditional and IRT Scoring of Forced-Choice Tests.
   Appl Psychol Meas. 2015 Nov;39(8):598-612. doi: 10.1177/0146621615585851. Epub 2015 May 19.
3. Using Multidimensional Scaling to Assess the Dimensionality of Dichotomous Item Data.
   Multivariate Behav Res. 2000 Apr 1;35(2):229-59. doi: 10.1207/S15327906MBR3502_4.
4. Investigating the Impact of Item Parameter Drift for Item Response Theory Models with Mixture Distributions.
   Front Psychol. 2016 Feb 24;7:255. doi: 10.3389/fpsyg.2016.00255. eCollection 2016.
5. Investigating the Practical Consequences of Model Misfit in Unidimensional IRT Models.
   Appl Psychol Meas. 2017 Sep;41(6):439-455. doi: 10.1177/0146621617695522. Epub 2017 Mar 17.
7. Item Response Theory Modeling of the Verb Naming Test.
   J Speech Lang Hear Res. 2023 May 9;66(5):1718-1739. doi: 10.1044/2023_JSLHR-22-00458. Epub 2023 Mar 31.
8. Parameter Recovery in Multidimensional Item Response Theory Models Under Complexity and Nonnormality.
   Appl Psychol Meas. 2017 Oct;41(7):530-544. doi: 10.1177/0146621617707507. Epub 2017 May 11.
10. The relationship between classical item characteristics and item response time on computer-based testing.
    Korean J Med Educ. 2019 Mar;31(1):1-9. doi: 10.3946/kjme.2019.113. Epub 2019 Mar 1.
