Similar articles

1
Measuring rater bias in diagnostic tests with ordinal ratings.
Stat Med. 2021 Jul 30;40(17):4014-4033. doi: 10.1002/sim.9011. Epub 2021 May 9.
2
Measuring intrarater association between correlated ordinal ratings.
Biom J. 2020 Nov;62(7):1687-1701. doi: 10.1002/bimj.201900177. Epub 2020 Jun 11.
3
Modeling rater diagnostic skills in binary classification processes.
Stat Med. 2018 Feb 20;37(4):557-571. doi: 10.1002/sim.7530. Epub 2017 Nov 2.
4
Assessing the influence of rater and subject characteristics on measures of agreement for ordinal ratings.
Stat Med. 2017 Sep 10;36(20):3181-3199. doi: 10.1002/sim.7323. Epub 2017 Jun 13.
5
Measures of agreement between many raters for ordinal classifications.
Stat Med. 2015 Oct 15;34(23):3116-32. doi: 10.1002/sim.6546. Epub 2015 Jun 21.
6
Evaluating the effects of rater and subject factors on measures of association.
Biom J. 2018 May;60(3):639-656. doi: 10.1002/bimj.201700078. Epub 2018 Jan 19.
7
Improving the reliability of diagnostic tests in population-based agreement studies.
Stat Med. 2010 Mar 15;29(6):617-26. doi: 10.1002/sim.3819.
8
A paired kappa to compare binary ratings across two medical tests.
Stat Med. 2019 Jul 30;38(17):3272-3287. doi: 10.1002/sim.8200. Epub 2019 May 17.
9
Summary measures of agreement and association between many raters' ordinal classifications.
Ann Epidemiol. 2017 Oct;27(10):677-685.e4. doi: 10.1016/j.annepidem.2017.09.001. Epub 2017 Sep 22.
10
Quantifying rater variation for ordinal data using a rating scale model.
Stat Med. 2018 Jun 30;37(14):2223-2237. doi: 10.1002/sim.7639. Epub 2018 Apr 16.

Measuring rater bias in diagnostic tests with ordinal ratings.

Affiliations

Department of Statistics, SungKyunKwan University, Jongno-gu, South Korea.

Department of Statistics, University of South Carolina, Columbia, South Carolina, USA.

Publication information

Stat Med. 2021 Jul 30;40(17):4014-4033. doi: 10.1002/sim.9011. Epub 2021 May 9.

DOI:10.1002/sim.9011
PMID:33969509
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8277718/
Abstract

Diagnostic tests are frequently reliant upon the interpretation of images by skilled raters. In many clinical settings, however, the variability observed between experts' ratings plays a detrimental role in the degree of confidence in these interpretations, leading to uncertainty in the diagnostic process. For example, in breast cancer testing, radiologists interpret mammographic images, while breast biopsy results are examined by pathologists. Each of these procedures involves elements of subjectivity. We propose here a flexible two-stage Bayesian latent variable model to investigate how the skills of individual raters impact the diagnostic accuracy of image-related testing in large-scale medical testing studies. A strength of the proposed model is that the true disease status of a patient within a reasonable time frame may or may not be known. In these studies, many raters each contribute classifications on a large sample of patients using a defined ordinal grading scale, leading to a complex correlation structure between ratings. Our modeling approach considers the different sources of variability contributed by experts and patients while accounting for correlations present between ratings and patients, in contrast to currently available methods. We propose a novel measure of a rater's ability (magnifier) that, in contrast to conventional measures of sensitivity and specificity, is robust to the underlying prevalence of disease in the population, providing an alternative measure of diagnostic accuracy across patient populations. Extensive simulation studies demonstrate lower bias in estimation of parameters and measures of accuracy, and illustrate outperformance of the proposed model when compared with existing models. Receiver operator characteristic curves are derived to assess the diagnostic accuracy of individual experts and their overall performance. 
Our proposed modeling approach is applied to a large breast imaging study for known disease status and a uterine cancer dataset for unknown disease status.
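
The abstract mentions receiver operating characteristic curves for individual experts. As a rough illustration only, the sketch below computes an empirical ROC curve for a single rater from ordinal ratings when disease status is known. It is not the authors' two-stage Bayesian latent variable model and does not compute their "magnifier" measure; the function names, the 1-to-5 grading scale, and the toy ratings are all hypothetical.

```python
from typing import List, Sequence, Tuple


def empirical_roc(
    ratings: Sequence[int], diseased: Sequence[int], n_categories: int
) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points for one rater.

    Each cut-off c calls a patient "positive" when rating >= c. Cut-offs run
    from n_categories + 1 (nobody positive) down to 1 (everybody positive),
    so the points go from (0, 0) to (1, 1).
    """
    if len(ratings) != len(diseased):
        raise ValueError("ratings and diseased must have the same length")
    n_pos = sum(1 for d in diseased if d == 1)
    n_neg = len(diseased) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one diseased and one non-diseased patient")

    points = []
    for cutoff in range(n_categories + 1, 0, -1):
        tp = sum(1 for r, d in zip(ratings, diseased) if d == 1 and r >= cutoff)
        fp = sum(1 for r, d in zip(ratings, diseased) if d != 1 and r >= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


if __name__ == "__main__":
    # Hypothetical toy data: one rater, eight patients, 5-point ordinal scale.
    toy_diseased = [1, 1, 1, 1, 0, 0, 0, 0]
    toy_ratings = [5, 4, 4, 2, 3, 2, 1, 1]
    roc = empirical_roc(toy_ratings, toy_diseased, n_categories=5)
    print(roc)
    print(trapezoid_auc(roc))
```

Each point on the curve is the (1 - specificity, sensitivity) pair obtained by dichotomizing the ordinal scale at one cut-off, which is why an ordinal scale with K categories yields at most K + 1 distinct points. This empirical version needs a known disease status for every patient; the unknown-status case described in the abstract requires a model-based approach instead.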
