A new item response theory model for rater centrality using a hierarchical rater model approach.

Affiliations

Faculty of Education, The University of Hong Kong, Hong Kong, China.

Department of Special Education and Counselling, The Education University of Hong Kong, Hong Kong, China.

Publication information

Behav Res Methods. 2022 Aug;54(4):1854-1868. doi: 10.3758/s13428-021-01699-y. Epub 2021 Nov 1.

DOI:10.3758/s13428-021-01699-y
PMID:34725802
Abstract

Rater centrality, in which raters overuse middle scores for rating, is a common rater error which can affect test scores and subsequent decisions. Past studies on rater errors have focused on rater severity and inconsistency, neglecting rater centrality. This study proposes a new model within the hierarchical rater model framework to explicitly specify and directly estimate rater centrality in addition to rater severity and inconsistency. Simulations were conducted using the freeware JAGS to evaluate the parameter recovery of the new model and the consequences of ignoring rater centrality. The results revealed that the model had good parameter recovery with small bias, low root mean square errors, and high test score reliability, especially when a fully crossed linking design was used. Ignoring centrality yielded poor item difficulty estimates, person ability estimates, rater errors estimates, and underestimated reliability. We also showcase how the new model can be used, using an empirical example involving English essays in the Advanced Placement exam.
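The abstract summarizes parameter recovery in terms of bias and root mean square error (RMSE). As a minimal sketch of how those two summaries are conventionally computed over simulation replications, the Python below uses their standard definitions. The function names and the numeric values are hypothetical illustrations, not the authors' code or results, and the model fitting itself (done in JAGS in the study) is not shown.

```python
import math


def bias(estimates, true_value):
    """Mean signed difference between the estimates and the generating value."""
    return sum(e - true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Square root of the mean squared difference from the generating value."""
    squared_errors = [(e - true_value) ** 2 for e in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical example: one rater parameter with generating value 0.5,
# estimated in four simulated replications (values invented for illustration).
true_value = 0.5
estimates = [0.4, 0.6, 0.5, 0.7]

print("bias:", bias(estimates, true_value))
print("RMSE:", rmse(estimates, true_value))
```

Bias near zero means the estimates are not systematically too high or too low, while RMSE also reflects how widely they scatter around the generating value, so a parameter can show small bias and still have a large RMSE.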

Similar articles

1
A new item response theory model for rater centrality using a hierarchical rater model approach.
Behav Res Methods. 2022 Aug;54(4):1854-1868. doi: 10.3758/s13428-021-01699-y. Epub 2021 Nov 1.
2
Cognitive Diagnostic Models for Rater Effects.
Front Psychol. 2020 Mar 24;11:525. doi: 10.3389/fpsyg.2020.00525. eCollection 2020.
3
Human ratings take time: A hierarchical facets model for the joint analysis of ratings and rating times.
Behav Res Methods. 2024 Apr;56(4):3535-3547. doi: 10.3758/s13428-023-02259-2. Epub 2023 Nov 2.
4
Detecting Rater Effects under Rating Designs with Varying Levels of Missingness.
J Appl Meas. 2018;19(3):243-257.
5
Comparison of Models and Indices for Detecting Rater Centrality.
J Appl Meas. 2015;16(3):228-41.
6
Using Repeated Ratings to Improve Measurement Precision in Incomplete Rating Designs.
J Appl Meas. 2018;19(2):148-161.
7
A Bayesian hierarchical latent trait model for estimating rater bias and reliability in large-scale performance assessment.
PLoS One. 2018 Apr 3;13(4):e0195297. doi: 10.1371/journal.pone.0195297. eCollection 2018.
8
A Hierarchical Rater Model for Longitudinal Data.
Multivariate Behav Res. 2017 Sep-Oct;52(5):576-592. doi: 10.1080/00273171.2017.1342202. Epub 2017 Aug 28.
9
Modeling rater diagnostic skills in binary classification processes.
Stat Med. 2018 Feb 20;37(4):557-571. doi: 10.1002/sim.7530. Epub 2017 Nov 2.
10
Detecting rater bias using a person-fit statistic: a Monte Carlo simulation study.
Perspect Med Educ. 2018 Apr;7(2):83-92. doi: 10.1007/s40037-017-0391-8.

Cited by

1
Linking essay-writing tests using many-facet models and neural automated essay scoring.
Behav Res Methods. 2024 Dec;56(8):8450-8479. doi: 10.3758/s13428-024-02485-2. Epub 2024 Aug 20.
2
A Bayesian many-facet Rasch model with Markov modeling for rater severity drift.
Behav Res Methods. 2023 Oct;55(7):3910-3928. doi: 10.3758/s13428-022-01997-z. Epub 2022 Oct 25.
