Department of Epidemiology and Biostatistics and the EMGO Institute for Health and Care Research, VU University Medical Center, Amsterdam, The Netherlands.
BMC Med Res Methodol. 2010 Sep 22;10:82. doi: 10.1186/1471-2288-10-82.
The COSMIN checklist is a tool for evaluating the methodological quality of studies on measurement properties of health-related patient-reported outcomes. The aim of this study was to determine the inter-rater agreement and reliability of each item score of the COSMIN checklist (n = 114 items).
Seventy-five articles evaluating measurement properties were randomly selected from the bibliographic database compiled by the Patient-Reported Outcome Measurement Group, Oxford, UK. Each rater was asked to assess the methodological quality of three articles using the COSMIN checklist. In a one-way design, percentage agreement and intraclass kappa coefficients or quadratic-weighted kappa coefficients were calculated for each item.
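The two statistics named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes that percentage agreement is the share of agreeing rater pairs within an article, and that the intraclass kappa for a dichotomous item can be approximated by a one-way random-effects intraclass correlation that allows an unequal number of raters per article. The function names and the example ratings are hypothetical.

```python
from itertools import combinations


def percentage_agreement(ratings_by_article):
    """Share of rater pairs, within the same article, that gave the same score."""
    agree = total = 0
    for scores in ratings_by_article:
        for a, b in combinations(scores, 2):
            agree += (a == b)
            total += 1
    return agree / total


def one_way_icc(ratings_by_article):
    """One-way random-effects ICC for an unequal number of raters per article.

    Assumption: used here as an approximation of the intraclass kappa
    for items scored 0/1.
    """
    n = len(ratings_by_article)                      # number of articles
    total_n = sum(len(s) for s in ratings_by_article)  # number of ratings
    grand_mean = sum(sum(s) for s in ratings_by_article) / total_n

    ss_between = 0.0
    ss_within = 0.0
    for scores in ratings_by_article:
        mean = sum(scores) / len(scores)
        ss_between += len(scores) * (mean - grand_mean) ** 2
        ss_within += sum((x - mean) ** 2 for x in scores)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (total_n - n)
    # Adjusted average number of raters per article for unbalanced data.
    k0 = (total_n - sum(len(s) ** 2 for s in ratings_by_article) / total_n) / (n - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)


# Hypothetical ratings of one checklist item (1 = "yes", 0 = "no"),
# one inner list per article, two to four raters per article.
skewed_item = [[1, 1, 1], [1, 1], [1, 1, 0], [1, 1, 1, 1], [1, 1]]
balanced_item = [[1, 1], [0, 0], [1, 1], [0, 0]]

print(round(percentage_agreement(skewed_item), 3))
print(round(one_way_icc(skewed_item), 3))
print(round(one_way_icc(balanced_item), 3))
```

In the hypothetical `skewed_item` data, 12 of 14 rater pairs agree (about 86%), yet the coefficient is slightly below zero because almost every article receives the same score, so the item barely distinguishes between articles. In `balanced_item`, raters agree perfectly and the articles differ, so the coefficient is 1.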
Eighty-eight raters participated. Of the 75 selected articles, 26 were rated by four to six participants, and 49 by two or three participants. Overall, percentage agreement was appropriate (68% of the items had above 80% agreement), but the kappa coefficients for the COSMIN items were low (61% were below 0.40, and 6% were above 0.75). Reasons for low inter-rater agreement were the need for subjective judgement and raters being accustomed to different standards, terminology and definitions.
The results indicated that raters often chose the same response option, but that it is difficult at the item level to distinguish between articles. When using the COSMIN checklist in a systematic review, we recommend obtaining some training and experience, having two independent raters complete the checklist, and reaching consensus on one final rating. The instructions for using the checklist have been improved.