
Robustness of κ-type coefficients for clinical agreement.

Affiliation

Department of Industrial Engineering, University of Naples "Federico II", Naples, Italy.

Publication information

Stat Med. 2022 May 20;41(11):1986-2004. doi: 10.1002/sim.9341. Epub 2022 Feb 6.

Abstract

The degree of inter-rater agreement is usually assessed through κ-type coefficients, and the extent of agreement is then characterized by comparing the value of the adopted coefficient against a benchmark scale. Two motivating examples show how some κ-type coefficients behave differently when the marginal frequencies are distributed asymmetrically over categories. An extensive Monte Carlo simulation study has been conducted to investigate the robustness of four κ-type coefficients for nominal and ordinal classifications and of an inferential benchmarking procedure that, unlike straightforward benchmarking, does not neglect the influence of the experimental conditions. Robustness has been investigated for several scenarios, differing in sample size, rating scale dimension, number of raters, frequency distribution of rater classifications, and pattern of agreement across raters. Simulation results reveal a more pronounced paradoxical behavior of Fleiss kappa and Conger kappa with ordinal rather than nominal classifications; the coefficients' robustness improves with increasing sample size and number of raters for both nominal and ordinal classifications, whereas robustness improves with rating scale dimension only for nominal classifications. By identifying the scenarios (ie, minimum sample size, number of raters, rating scale dimension) with acceptable robustness, this study provides guidelines for the design of robust agreement studies.
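The sensitivity to asymmetric marginal frequencies mentioned above can be illustrated with a minimal sketch of the standard Fleiss kappa formula. This is not the authors' simulation code: the function name `fleiss_kappa` and both small data sets are hypothetical, chosen only so that the two tables share the same observed agreement while differing in how ratings are spread over the two categories.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j. Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_categories = len(counts[0])
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject must have the same number of ratings")

    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    p_obs = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_subjects

    # Chance agreement from the pooled marginal category proportions.
    total = n_subjects * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    p_exp = sum(p * p for p in p_cat)

    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical data: 10 subjects, 3 raters, 2 categories.
# Both tables have 8 unanimous subjects and 2 subjects with a 2-vs-1 split,
# so the observed agreement is identical; only the marginals differ.
balanced = [[3, 0]] * 4 + [[0, 3]] * 4 + [[2, 1], [1, 2]]
skewed = [[3, 0]] * 8 + [[2, 1]] * 2

print("balanced marginals:", round(fleiss_kappa(balanced), 4))
print("skewed marginals:  ", round(fleiss_kappa(skewed), 4))
```

With these made-up tables the coefficient is about 0.73 for the balanced marginals and slightly negative (about -0.07) for the skewed ones, even though the raters agree equally often in both. This is the kind of paradoxical behavior whose dependence on sample size, number of raters, and rating scale dimension the study examines.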


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6587/9303799/d0e775fb29cf/SIM-41-1986-g002.jpg
