
A Bayesian many-facet Rasch model with Markov modeling for rater severity drift.

Affiliation

The University of Electro-Communications, Tokyo, Japan.

Publication information

Behav Res Methods. 2023 Oct;55(7):3910-3928. doi: 10.3758/s13428-022-01997-z. Epub 2022 Oct 25.

DOI:10.3758/s13428-022-01997-z
PMID:36284065
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10615980/
Abstract

Fair performance assessment requires consideration of the effects of rater severity on scoring. The many-facet Rasch model (MFRM), an item response theory model that incorporates rater severity parameters, has been widely used for this purpose. Although a typical MFRM assumes that rater severity does not change during the rating process, in actuality rater severity is known to change over time, a phenomenon called rater severity drift. To investigate this drift, several extensions of the MFRM have been proposed that incorporate time-specific rater severity parameters. However, these previous models estimate the severity parameters under the assumption of temporal independence. This introduces inefficiency into the parameter estimation because severities between adjacent time points tend to have temporal dependency in practice. To resolve this problem, we propose a Bayesian extension of the MFRM that incorporates time dependency for the rater severity parameters, based on a Markov modeling approach. The proposed model can improve the estimation accuracy of the time-specific rater severity parameters, resulting in improved estimation accuracy for the other rater parameters and for model fitting. We demonstrate the effectiveness of the proposed model through simulation experiments and application to actual data.
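
The idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model or code: the abstract gives no equations, so it assumes a common rating-scale form of the MFRM and a Gaussian random walk as the simplest Markov dependence between adjacent time points. All names (category_probs, simulate_severity_drift, simulate_ratings, sd_drift) are hypothetical and chosen for illustration only.

```python
import math
import random


def category_probs(theta, severity, steps):
    """Category probabilities under an assumed rating-scale MFRM.

    P(score = k) is proportional to exp(sum_{m<=k} (theta - severity - steps[m])),
    with the lowest category as the reference (logit 0).
    """
    logits = [0.0]
    for d in steps:
        logits.append(logits[-1] + (theta - severity - d))
    mx = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - mx) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_severity_drift(n_raters, n_times, sd_init=1.0, sd_drift=0.3, rng=None):
    """Draw one severity path per rater from a Gaussian random walk.

    Assumption: severity at time t depends only on severity at time t - 1
    (first-order Markov), beta[t] ~ Normal(beta[t - 1], sd_drift).
    """
    rng = rng or random.Random(0)
    paths = []
    for _ in range(n_raters):
        beta = [rng.gauss(0.0, sd_init)]
        for _ in range(1, n_times):
            beta.append(rng.gauss(beta[-1], sd_drift))
        paths.append(beta)
    return paths


def simulate_ratings(abilities, severity_paths, steps, rng=None):
    """Simulate (examinee, rater, time, score) tuples.

    Assumption: examinees are rated in index order, so the time point of
    examinee j is its position scaled to the number of time points.
    """
    rng = rng or random.Random(1)
    n_times = len(severity_paths[0])
    data = []
    for j, theta in enumerate(abilities):
        t = j * n_times // len(abilities)
        for r, path in enumerate(severity_paths):
            p = category_probs(theta, path[t], steps)
            k = rng.choices(range(len(p)), weights=p)[0]
            data.append((j, r, t, k))
    return data
```

For example, simulate_ratings([0.0] * 20, simulate_severity_drift(3, 5), [-1.0, 0.0, 1.0]) yields 60 ratings on a four-category scale. A small sd_drift keeps adjacent severities close, which is the kind of temporal dependency that a model assuming temporal independence would not exploit.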
