Assessing the influence of rater and subject characteristics on measures of agreement for ordinal ratings.

Authors

Nelson Kerrie P, Mitani Aya A, Edwards Don

Affiliations

Department of Biostatistics, Boston University, 801 Massachusetts Avenue, Boston, MA, 02118, U.S.A.

Department of Statistics, University of South Carolina, Columbia, SC, 29208, U.S.A.

Publication information

Stat Med. 2017 Sep 10;36(20):3181-3199. doi: 10.1002/sim.7323. Epub 2017 Jun 13.

Abstract

Widespread inconsistencies are commonly observed between physicians' ordinal classifications of screening test results, such as those from mammography. These discrepancies have motivated large-scale agreement studies in which many raters contribute ratings. The primary goal of these studies is to identify factors related to physicians and to patients' test results that may lead to stronger consistency between raters' classifications. While ordered categorical scales are frequently used to classify screening test results, very few statistical approaches exist to model agreement between multiple raters. Here we develop a flexible and comprehensive approach to assess the influence of rater and subject characteristics on agreement between multiple raters' ordinal classifications in large-scale agreement studies. Our approach is based upon the class of generalized linear mixed models. Novel model-based summary measures are proposed to assess agreement between all raters or a subgroup of raters, such as experienced physicians. Hypothesis tests are described to formally identify factors, such as physicians' level of experience, that play an important role in improving consistency of ratings between raters. We demonstrate how unique characteristics of individual raters can be assessed via conditional modes generated during the modeling process. Simulation studies are presented to demonstrate the performance of the proposed methods and summary measure of agreement. The methods are applied to a large-scale mammography agreement study to investigate the effects of rater and patient characteristics on the strength of agreement between radiologists. Copyright © 2017 John Wiley & Sons, Ltd.
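As a rough illustration of the kind of data structure the abstract describes, the sketch below simulates ordinal ratings in which every rater classifies every subject, using a latent-variable model with crossed subject and rater random effects. This is not the authors' model or their summary measure: the probit-style latent scale, the function names (`simulate_ratings`, `pairwise_agreement`, `latent_icc`), the thresholds, and the variance values are all assumptions chosen for illustration, and the raw pairwise agreement and latent-scale variance ratio shown here are simple stand-ins rather than the model-based measures proposed in the paper.

```python
import random
from itertools import combinations


def simulate_ratings(n_subjects, n_raters, thresholds, sd_subject, sd_rater, seed=0):
    """Simulate ordinal ratings with crossed subject and rater random effects.

    Assumed (hypothetical) model: latent score = u_i + v_j + e_ij with
    u_i ~ N(0, sd_subject^2), v_j ~ N(0, sd_rater^2), e_ij ~ N(0, 1).
    The rating is 1 plus the number of sorted thresholds the latent score exceeds.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0.0, sd_subject) for _ in range(n_subjects)]  # subject effects
    v = [rng.gauss(0.0, sd_rater) for _ in range(n_raters)]      # rater effects
    ratings = []
    for i in range(n_subjects):
        row = []
        for j in range(n_raters):
            latent = u[i] + v[j] + rng.gauss(0.0, 1.0)
            row.append(1 + sum(latent > t for t in thresholds))
        ratings.append(row)
    return ratings


def pairwise_agreement(ratings):
    """Proportion of rater pairs, pooled over subjects, giving identical ratings."""
    agree = 0
    total = 0
    for row in ratings:
        for a, b in combinations(row, 2):
            agree += (a == b)
            total += 1
    return agree / total


def latent_icc(sd_subject, sd_rater):
    """Share of latent variance due to subjects (residual variance fixed at 1)."""
    return sd_subject ** 2 / (sd_subject ** 2 + sd_rater ** 2 + 1.0)


if __name__ == "__main__":
    data = simulate_ratings(200, 8, [-1.0, 0.0, 1.0], sd_subject=2.0, sd_rater=0.5)
    print("observed pairwise agreement:", round(pairwise_agreement(data), 3))
    print("latent-scale variance ratio:", round(latent_icc(2.0, 0.5), 3))
```

In this toy setup, a larger subject variance relative to the rater and residual variances means raters' classifications are driven more by the subject than by the rater, so observed agreement rises. Fitting such a model to real data would require dedicated mixed-model software, which this sketch does not attempt.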

Similar articles

Measuring intrarater association between correlated ordinal ratings.
Biom J. 2020 Nov;62(7):1687-1701. doi: 10.1002/bimj.201900177. Epub 2020 Jun 11.

References

Missing data methods in longitudinal studies: a review.
Test (Madr). 2009 May 1;18(1):1-43. doi: 10.1007/s11749-009-0138-x.
