A comparison of item response models for accuracy and speed of item responses with applications to adaptive testing.

Author information

van Rijn Peter W, Ali Usama S

Affiliations

ETS Global, Amsterdam, The Netherlands.

Educational Testing Service, Princeton, New Jersey, USA.

Publication information

Br J Math Stat Psychol. 2017 May;70(2):317-345. doi: 10.1111/bmsp.12101.

DOI:10.1111/bmsp.12101

PMID:28474769

Abstract

We compare three modelling frameworks for accuracy and speed of item responses in the context of adaptive testing. The first framework is based on modelling scores that result from a scoring rule that incorporates both accuracy and speed. The second framework is the hierarchical modelling approach developed by van der Linden (2007, Psychometrika, 72, 287) in which a regular item response model is specified for accuracy and a log-normal model for speed. The third framework is the diffusion framework in which the response is assumed to be the result of a Wiener process. Although the three frameworks differ in the relation between accuracy and speed, one commonality is that the marginal model for accuracy can be simplified to the two-parameter logistic model. We discuss both conditional and marginal estimation of model parameters. Models from all three frameworks were fitted to data from a mathematics and spelling test. Furthermore, we applied a linear and adaptive testing mode to the data off-line in order to determine differences between modelling frameworks. It was found that a model from the scoring rule framework outperformed a hierarchical model in terms of model-based reliability, but the results were mixed with respect to correlations with external measures.
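For orientation, the two components of the hierarchical framework named in the abstract can be sketched as follows. This is a minimal illustration, assuming the standard two-parameter logistic form for accuracy and a commonly used log-normal parameterization for response time. The function and parameter names (`theta`, `tau`, `a`, `b`, `alpha`, `beta`) are assumptions chosen for illustration and may not match the paper's notation. The joint distribution of the person parameters, which is what links accuracy and speed in this framework, is omitted.

```python
import math


def p_correct_2pl(theta, a, b):
    """Probability of a correct response under a two-parameter logistic model.

    theta: person ability; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lognormal_rt_density(t, tau, alpha, beta):
    """Density of response time t under a log-normal model.

    Assumed parameterization: ln T ~ Normal(beta - tau, 1 / alpha**2),
    where tau is person speed, beta is item time intensity, and alpha
    is the item's time discrimination (inverse of the log-time SD).
    """
    if t <= 0:
        raise ValueError("response time must be positive")
    z = alpha * (math.log(t) - (beta - tau))
    return alpha / (t * math.sqrt(2.0 * math.pi)) * math.exp(-0.5 * z * z)


# Example: an average person on an item of average difficulty.
p = p_correct_2pl(theta=0.0, a=1.0, b=0.0)
d = lognormal_rt_density(t=1.0, tau=0.0, alpha=1.0, beta=0.0)
```

In this sketch a higher `tau` shifts the response-time distribution toward shorter times, while `theta` affects only the probability of a correct response.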

Similar articles

A Generalized Speed-Accuracy Response Model for Dichotomous Items.

Psychometrika. 2018 Mar;83(1):109-131. doi: 10.1007/s11336-017-9590-9. Epub 2017 Nov 21.

A generalized linear factor model approach to the hierarchical framework for responses and response times.

Br J Math Stat Psychol. 2015 May;68(2):197-219. doi: 10.1111/bmsp.12042. Epub 2014 Aug 11.

Response moderation models for conditional dependence between response time and response accuracy.

Br J Math Stat Psychol. 2017 May;70(2):257-279. doi: 10.1111/bmsp.12076. Epub 2016 Sep 12.

Marginal likelihood inference for a model for item responses and response times.

Br J Math Stat Psychol. 2010 Nov;63(Pt 3):603-26. doi: 10.1348/000711009X481360. Epub 2010 Jan 28.

Robust estimation of the hierarchical model for responses and response times.

Br J Math Stat Psychol. 2019 Feb;72(1):83-107. doi: 10.1111/bmsp.12143. Epub 2018 Jul 27.

Modeling Differences Between Response Times of Correct and Incorrect Responses.

Psychometrika. 2019 Dec;84(4):1018-1046. doi: 10.1007/s11336-019-09682-5. Epub 2019 Aug 28.

Improving precision of ability estimation: Getting more from response times.

Br J Math Stat Psychol. 2018 Feb;71(1):13-38. doi: 10.1111/bmsp.12104. Epub 2017 Jun 21.

Characterizing the Manifest Probability Distributions of Three Latent Trait Models for Accuracy and Response Time.

Psychometrika. 2019 Sep;84(3):870-891. doi: 10.1007/s11336-019-09668-3. Epub 2019 Mar 27.

Spontaneous and imposed speed of cognitive test responses.

Br J Math Stat Psychol. 2017 May;70(2):225-237. doi: 10.1111/bmsp.12094. Epub 2017 Feb 3.

Cited by

A Recent Development of a Network Approach to Assessment Data: Latent Space Item Response Modeling for Intelligence Studies.

J Intell. 2024 Mar 28;12(4):38. doi: 10.3390/jintelligence12040038.

Human ratings take time: A hierarchical facets model for the joint analysis of ratings and rating times.

Behav Res Methods. 2024 Apr;56(4):3535-3547. doi: 10.3758/s13428-023-02259-2. Epub 2023 Nov 2.

A Latent Space Diffusion Item Response Theory Model to Explore Conditional Dependence between Responses and Response Times.

Psychometrika. 2023 Sep;88(3):830-864. doi: 10.1007/s11336-023-09920-x. Epub 2023 Jun 14.

Testing Replicability and Generalizability of the Time on Task Effect.

J Intell. 2023 Apr 28;11(5):82. doi: 10.3390/jintelligence11050082.

The cyclical ethical effects of using artificial intelligence in education.

AI Soc. 2022 Sep 27:1-11. doi: 10.1007/s00146-022-01497-w.

Bridging Models of Biometric and Psychometric Assessment: A Three-Way Joint Modeling Approach of Item Responses, Response Times, and Gaze Fixation Counts.

Appl Psychol Meas. 2022 Jul;46(5):361-381. doi: 10.1177/01466216221089344. Epub 2022 May 27.

Modeling Conditional Dependence of Response Accuracy and Response Time with the Diffusion Item Response Theory Model.

Psychometrika. 2022 Jun;87(2):725-748. doi: 10.1007/s11336-021-09819-5. Epub 2022 Jan 6.

An Attention-Based Diffusion Model for Psychometric Analyses.

Psychometrika. 2021 Dec;86(4):938-972. doi: 10.1007/s11336-021-09783-0. Epub 2021 Jul 13.

Joint Modeling of Compensatory Multidimensional Item Responses and Response Times.

Appl Psychol Meas. 2019 Nov;43(8):639-654. doi: 10.1177/0146621618824853. Epub 2019 Feb 22.

Characterizing the Manifest Probability Distributions of Three Latent Trait Models for Accuracy and Response Time.

Psychometrika. 2019 Sep;84(3):870-891. doi: 10.1007/s11336-019-09668-3. Epub 2019 Mar 27.
