
Beyond kappa: an informational index for diagnostic agreement in dichotomous and multivalue ordered-categorical ratings.

Affiliations

Dipartimento di Matematica e Geoscienze, Università degli Studi di Trieste, Trieste, Italy.

Dipartimento di Area Medica, Istituto di Radiologia, Ospedale S. Maria della Misericordia, Università degli Studi di Udine, Udine, Italy.

Publication information

Med Biol Eng Comput. 2020 Dec;58(12):3089-3099. doi: 10.1007/s11517-020-02261-2. Epub 2020 Nov 3.

DOI: 10.1007/s11517-020-02261-2
PMID: 33145661
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7679268/
Abstract

Agreement measures are useful tools both to compare different evaluations of the same diagnostic outcomes and to validate new rating systems or devices. Cohen's kappa (κ) is certainly the most popular measure of agreement between two raters and has proved its effectiveness over the last sixty years. In spite of that, this method suffers from some alleged issues, which have been highlighted since the 1970s; moreover, its value is strongly dependent on the prevalence of the disease in the considered sample. This work introduces a new agreement index, the informational agreement (IA), which seems to avoid some of Cohen's kappa's flaws and separates the contribution of the prevalence from the nucleus of agreement. These goals are achieved by modelling the agreement, in both dichotomous and multivalue ordered-categorical cases, as the information shared between two raters through the virtual diagnostic channel connecting them: the more information exchanged between the raters, the higher their agreement. To assess its fair behaviour and effectiveness, IA has been tested on some cases known to be problematic for κ, in a machine learning context, and in a clinical scenario comparing ultrasound (US) and automated breast volume scanner (ABVS) in the setting of breast cancer imaging.

Graphical Abstract: To evaluate the agreement between the two raters [Formula: see text] and [Formula: see text], we create an agreement channel, based on Shannon information theory, that directly connects the random variables X and Y, which express the raters' outcomes. They are the terminals of the chain X ⇔ diagnostic test performed by [Formula: see text] ⇔ patient condition [Formula: see text] ⇔ diagnostic test performed by [Formula: see text] ⇔ Y.
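The prevalence dependence of κ described in the abstract can be illustrated with a small sketch. It computes Cohen's kappa and, as a rough stand-in for the idea of "information shared between two raters", the plain Shannon mutual information of a two-rater contingency table. This is not the paper's IA index, whose formula is not given in the abstract. The function names and the two example tables are hypothetical and chosen only for illustration.

```python
import math


def _marginals(table):
    """Return the total count and the row/column marginal probabilities."""
    n = sum(sum(row) for row in table)
    rows = [sum(row) / n for row in table]
    cols = [sum(row[j] for row in table) / n for j in range(len(table[0]))]
    return n, rows, cols


def cohen_kappa(table):
    """Cohen's kappa for a square table (rows: rater 1, columns: rater 2)."""
    n, rows, cols = _marginals(table)
    p_obs = sum(table[i][i] for i in range(len(table))) / n
    p_exp = sum(r * c for r, c in zip(rows, cols))  # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)


def mutual_information_bits(table):
    """Shannon mutual information I(X;Y) in bits between the two raters.

    Assumption: used here only as a generic information-theoretic quantity,
    not as the IA index defined in the paper.
    """
    n, rows, cols = _marginals(table)
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing
                p = count / n
                mi += p * math.log2(p / (rows[i] * cols[j]))
    return mi


# Hypothetical 2x2 tables, both with 90% raw agreement between the raters.
balanced = [[45, 5], [5, 45]]  # 50% of cases rated positive
skewed = [[85, 5], [5, 5]]     # 90% of cases rated positive

for name, table in (("balanced", balanced), ("skewed", skewed)):
    print(name, round(cohen_kappa(table), 3), round(mutual_information_bits(table), 3))
```

Both tables have the same raw agreement of 90%, yet κ is 0.80 for the balanced table and about 0.44 for the skewed one, which is the prevalence effect the abstract refers to.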


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7785/7679268/e85634747fe1/11517_2020_2261_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7785/7679268/c07eeded78bd/11517_2020_2261_Figd_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7785/7679268/8155efe6528b/11517_2020_2261_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7785/7679268/240850ae5175/11517_2020_2261_Fig2_HTML.jpg

