
The Effect of the Raters' Marginal Distributions on Their Matched Agreement: A Rescaling Framework for Interpreting Kappa.

Author Information

Karelitz Tzur M, Budescu David V

Affiliations

a National Institute for Testing and Evaluation, Jerusalem, Israel.

b Department of Psychology, Fordham University.

Publication Information

Multivariate Behav Res. 2013 Nov;48(6):923-52. doi: 10.1080/00273171.2013.830064.

DOI: 10.1080/00273171.2013.830064
PMID: 26745599
Abstract

Cohen's κ measures the improvement in classification above chance level and it is the most popular measure of interjudge agreement. Yet, there is considerable confusion about its interpretation. Specifically, researchers often ignore the fact that the observed level of matched agreement is bounded from above and below and the bounds are a function of the particular marginal distributions of the table. We propose that these bounds should be used to rescale the components of κ (observed and expected agreement). Rescaling κ in this manner results in κ', a measure that was originally proposed by Cohen (1960) and was largely ignored in both research and practice. This measure provides a common scale for agreement measures of tables with different marginal distributions. It reaches the maximal value of 1 when the judges show the highest level of agreement possible, given their marginal disagreements. We conclude that κ' should be used to measure the level of matched agreement contingent on a particular set of marginal distributions. The article provides a framework and a set of guidelines that facilitate comparisons between various types of agreement tables. We illustrate our points with simulations and real data from two studies-one involving judges' ratings of baseball players and one involving ratings of essays in high-stakes tests.
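The rescaling described in the abstract can be sketched numerically. The snippet below is a minimal illustration (not the authors' code), assuming the κ' in question is the κ/κ_max rescaling from Cohen (1960), where the upper bound on matched agreement given the marginals is p_max = Σᵢ min(rowᵢ, colᵢ):

```python
import numpy as np

def kappa_and_kappa_prime(table):
    """Compute Cohen's kappa and the rescaled kappa' = kappa / kappa_max.

    kappa' divides kappa by its maximum attainable value given the
    observed marginal distributions, so it reaches 1 when the raters
    agree as much as their marginal disagreements allow.
    """
    p = np.asarray(table, dtype=float)
    p /= p.sum()                        # joint proportions
    row = p.sum(axis=1)                 # rater 1 marginals
    col = p.sum(axis=0)                 # rater 2 marginals
    po = np.trace(p)                    # observed agreement
    pe = row @ col                      # chance-expected agreement
    pmax = np.minimum(row, col).sum()   # upper bound on matched agreement
    kappa = (po - pe) / (1 - pe)
    kappa_max = (pmax - pe) / (1 - pe)
    return kappa, kappa / kappa_max

# Example: raters with unequal marginals (50/50 vs. 45/55)
k, k_prime = kappa_and_kappa_prime([[40, 10], [5, 45]])
# kappa = 0.70; kappa' = 0.70 / 0.90 ≈ 0.778
```

Because p_max < 1 whenever the two marginal distributions differ, κ can never reach 1 for such a table, whereas κ' puts tables with different marginals on a common 0-to-1 scale.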


Similar Articles

1
The Effect of the Raters' Marginal Distributions on Their Matched Agreement: A Rescaling Framework for Interpreting Kappa.
Multivariate Behav Res. 2013 Nov;48(6):923-52. doi: 10.1080/00273171.2013.830064.
2
Summary measures of agreement and association between many raters' ordinal classifications.
Ann Epidemiol. 2017 Oct;27(10):677-685.e4. doi: 10.1016/j.annepidem.2017.09.001. Epub 2017 Sep 22.
3
Clinicians are right not to like Cohen's κ.
BMJ. 2013 Apr 12;346:f2125. doi: 10.1136/bmj.f2125.
4
Measuring agreement for ordered ratings in 3 x 3 tables.
Methods Inf Med. 2006;45(5):541-7.
5
[Quality criteria of assessment scales--Cohen's kappa as measure of interrator reliability (1)].
Pflege. 2004 Feb;17(1):36-46. doi: 10.1024/1012-5302.17.1.36.
6
Delta: a new measure of agreement between two raters.
Br J Math Stat Psychol. 2004 May;57(Pt 1):1-19. doi: 10.1348/000711004849268.
7
Pitfalls in the use of kappa when interpreting agreement between multiple raters in reliability studies.
Physiotherapy. 2014 Mar;100(1):27-35. doi: 10.1016/j.physio.2013.08.002. Epub 2013 Nov 18.
8
Random marginal agreement coefficients: rethinking the adjustment for chance when measuring agreement.
Biostatistics. 2005 Jan;6(1):171-80. doi: 10.1093/biostatistics/kxh027.
9
Evaluating the effects of rater and subject factors on measures of association.
Biom J. 2018 May;60(3):639-656. doi: 10.1002/bimj.201700078. Epub 2018 Jan 19.
10
A sequential test for assessing observed agreement between raters.
Biom J. 2018 Jan;60(1):128-145. doi: 10.1002/bimj.201600239. Epub 2017 Sep 12.

Cited By

1
Detecting Clusters/Communities in Social Networks.
Multivariate Behav Res. 2018 Jan-Feb;53(1):57-73. doi: 10.1080/00273171.2017.1391682. Epub 2017 Dec 8.
2
Analyzing Test-Taking Behavior: Decision Theory Meets Psychometric Theory.
Psychometrika. 2015 Dec;80(4):1105-22. doi: 10.1007/s11336-014-9425-x. Epub 2014 Aug 21.