
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance.

Author information

Gwet Kilem L

Affiliation

Advanced Analytics, LLC, Gaithersburg, MD, USA.

Publication information

Educ Psychol Meas. 2016 Aug;76(4):609-637. doi: 10.1177/0013164415596420. Epub 2015 Jul 28.

DOI:10.1177/0013164415596420
PMID:29795880
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5965565/
Abstract

This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling techniques. In this article, we propose a technique similar to the classical pairwise test for means, which is based on a large-sample linear approximation of the agreement coefficient. We illustrate the use of this technique with several known agreement coefficients including Cohen's kappa, Gwet's AC, Fleiss's generalized kappa, Conger's generalized kappa, Krippendorff's alpha, and the Brennan-Prediger coefficient. The proposed method is very flexible, can accommodate several types of correlation structures between coefficients, and requires neither advanced statistical modeling skills nor considerable computer programming experience. The validity of this method is tested with a Monte Carlo simulation.
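
The idea described in the abstract can be illustrated with a minimal Python sketch; it is not the article's own code. It assumes Cohen's kappa for two rater pairs who score the same n subjects. The per-subject linearized values come from a standard first-order (delta-method) expansion of kappa, the function names `linearized_kappa` and `paired_kappa_test` are hypothetical, and the normal reference distribution is an assumption (a t distribution with n - 1 degrees of freedom would be the closer analogue of the paired test for means).

```python
import math
from statistics import stdev


def linearized_kappa(rater_a, rater_b):
    """Return Cohen's kappa and one linearized value per subject.

    The per-subject values average to kappa, and their sample variance
    divided by n approximates the large-sample variance of kappa.
    """
    n = len(rater_a)
    if n != len(rater_b) or n < 2:
        raise ValueError("need two equally long rating lists with n >= 2")
    cats = set(rater_a) | set(rater_b)
    marg_a = {k: sum(x == k for x in rater_a) / n for k in cats}
    marg_b = {k: sum(y == k for y in rater_b) / n for k in cats}
    p_agree = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    p_chance = sum(marg_a[k] * marg_b[k] for k in cats)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p_agree - p_chance) / (1 - p_chance)
    values = []
    for x, y in zip(rater_a, rater_b):
        kappa_i = ((x == y) - p_chance) / (1 - p_chance)
        # First-order contribution of this subject to chance agreement.
        p_chance_i = (marg_b[x] + marg_a[y]) / 2
        correction = 2 * (1 - kappa) * (p_chance_i - p_chance) / (1 - p_chance)
        values.append(kappa_i - correction)
    return kappa, values


def paired_kappa_test(pair1, pair2):
    """Test kappa(pair1) - kappa(pair2) on the same subjects.

    Each pair is a tuple (rater_a, rater_b). Returns the difference,
    the test statistic, and a two-sided p-value (normal approximation).
    """
    k1, v1 = linearized_kappa(*pair1)
    k2, v2 = linearized_kappa(*pair2)
    if len(v1) != len(v2):
        raise ValueError("both pairs must rate the same subjects")
    # Differencing per subject accounts for the correlation between the
    # two coefficients, as in a paired test for means.
    diffs = [a - b for a, b in zip(v1, v2)]
    std_err = stdev(diffs) / math.sqrt(len(diffs))
    if std_err == 0:
        raise ValueError("per-subject differences have zero variance")
    z = (k1 - k2) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return k1 - k2, z, p_value


if __name__ == "__main__":
    # Made-up ratings: raters A, B, and C score the same ten subjects.
    a = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
    b = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1]
    c = [1, 0, 1, 0, 1, 0, 0, 0, 1, 1]
    print(paired_kappa_test((a, b), (a, c)))
```

Working with per-subject differences means no explicit covariance between the two coefficients has to be modeled. The same pattern would apply to other agreement coefficients once their own linearized per-subject values are available; those formulas are not reproduced here.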

