Item Response Theory Modeling for Examinee-selected Items with Rater Effect.

Authors

Liu Chen-Wei, Qiu Xue-Lan, Wang Wen-Chung

Affiliations

The Chinese University of Hong Kong, Sha Tin, Hong Kong.

The Education University of Hong Kong, Tai Po, Hong Kong.

Publication information

Appl Psychol Meas. 2019 Sep;43(6):435-448. doi: 10.1177/0146621618798667. Epub 2018 Oct 8.

Abstract

Some large-scale testing requires examinees to select and answer a fixed number of items from given items (e.g., select one out of the three items). Usually, they are constructed-response items that are marked by human raters. In this examinee-selected item (ESI) design, some examinees may benefit more than others from choosing easier items to answer, and so the missing data induced by the design become missing not at random (MNAR). Although item response theory (IRT) models have recently been developed to account for MNAR data in the ESI design, they do not consider the rater effect; thus, their utility is seriously restricted. In this study, two methods are developed: the first one is a new IRT model to account for both MNAR data and rater severity simultaneously, and the second one adapts conditional maximum likelihood estimation and pairwise estimation methods to the ESI design with the rater effect. A series of simulations was then conducted to compare their performance with those of conventional IRT models that ignored MNAR data or rater severity. The results indicated a good parameter recovery for the new model. The conditional maximum likelihood estimation and pairwise estimation methods were applicable when the Rasch models fit the data, but the conventional IRT models yielded biased parameter estimates. An empirical example was given to illustrate these new initiatives.
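The selection mechanism described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the model proposed in the paper: it assumes dichotomous scores, a Rasch-type logit with an additive rater-severity term, and a hypothetical per-item "familiarity" effect that examinees know about and use when choosing one of three items. All parameter values and names (`simulate`, `difficulties`, `severities`, `familiarity`) are invented for illustration.

```python
import math
import random


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n_examinees=20000, seed=0):
    """Toy ESI design: each examinee answers one self-selected item out of three."""
    rng = random.Random(seed)
    difficulties = [-0.5, 0.0, 0.5]  # hypothetical item difficulties
    severities = [-0.3, 0.0, 0.3]    # hypothetical rater severities
    n_items = len(difficulties)

    chosen_n = [0] * n_items        # how often each item was selected
    chosen_correct = [0] * n_items  # successes on selected (observed) items
    full_correct = [0] * n_items    # successes if every item had been answered

    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)  # examinee ability
        # Assumed item-specific familiarity, known to the examinee.
        familiarity = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
        rater = rng.randrange(len(severities))  # randomly assigned rater

        # Potential score on every item, including those never observed.
        scores = []
        for i in range(n_items):
            logit = theta + familiarity[i] - difficulties[i] - severities[rater]
            scores.append(1 if rng.random() < logistic(logit) else 0)
            full_correct[i] += scores[i]

        # The examinee picks the item that looks easiest to them.
        pick = max(range(n_items), key=lambda i: familiarity[i] - difficulties[i])
        chosen_n[pick] += 1
        chosen_correct[pick] += scores[pick]

    observed = [c / n for c, n in zip(chosen_correct, chosen_n)]
    complete = [c / n_examinees for c in full_correct]
    return observed, complete, chosen_n


if __name__ == "__main__":
    observed, complete, chosen_n = simulate()
    for i, (o, c, n) in enumerate(zip(observed, complete, chosen_n)):
        print(f"item {i}: selected by {n}, observed p={o:.3f}, complete-data p={c:.3f}")
```

In this sketch the observed success rate on each item exceeds the rate that would be seen if every examinee had answered it, because selection depends on a quantity tied to the unobserved responses. That dependence is what makes the missing responses MNAR rather than ignorable.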

Similar articles

A General Unfolding IRT Model for Multiple Response Styles.
Appl Psychol Meas. 2019 May;43(3):195-210. doi: 10.1177/0146621618762743. Epub 2018 Apr 16.
