Schmidt Andrew H, Zhao Guofen, Turkelson Charles
Department of Orthopedic Surgery, Mailcode G2, Hennepin County Medical Center, 701 Park Avenue, Minneapolis, MN 55415, USA.
J Bone Joint Surg Am. 2009 Apr;91(4):867-73. doi: 10.2106/JBJS.G.01233.
A hierarchy of levels of evidence is commonly used to categorize the methodology of scientific studies in order to assist in their critical analysis. Organizers of large scientific meetings are faced with the problem of whether and how to assign levels of evidence to studies that are presented. The present study was performed to investigate two hypotheses: (1) that session moderators and others can consistently assign a level of evidence to papers presented at national meetings, and (2) that there is no difference between the level of evidence provided by the author of a paper and the level of evidence assigned by independent third parties (e.g., members of the Program Committee).
A subset of papers accepted for presentation at the 2007 American Academy of Orthopaedic Surgeons (AAOS) Annual Meeting was used to evaluate differences in the levels of evidence assigned by the authors, volunteer graders who had access to only the abstract, and session moderators who had access to the full paper. The approved AAOS levels of evidence were used. Statistical tests of interrater correlation were done to compare the various raters to each other, with significance appropriately adjusted for multiple comparisons.
Interrater agreement was better than chance for most comparisons between different graders; however, the level of agreement ranged from slight to moderate (kappa = 0.16 to 0.46), a finding confirmed by agreement coefficient statistics. In general, raters had difficulty agreeing on whether a study constituted Level-I or Level-II evidence, and authors graded the level of evidence of their own work more favorably than did others who graded the abstract.
When abstracts submitted to the AAOS Annual Meeting were rated, there was substantial inconsistency among observers in the level of evidence assigned to a given study, and there was some evidence that authors may not rate their own work the same way that independent reviewers do. This has important implications for the use of levels of evidence at scientific meetings.
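The interrater agreement reported above was quantified with the kappa statistic, which corrects raw percent agreement for the agreement expected by chance given each rater's marginal label frequencies. As a minimal sketch of how such a value is computed, the following computes Cohen's kappa for two raters assigning levels of evidence (I-IV) to the same set of studies; the rating data are hypothetical and not taken from the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the chance agreement implied by each
    rater's marginal label frequencies.
    """
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from the two raters' marginals.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical level-of-evidence assignments by an author and an
# independent grader (illustrative only).
author = ["I", "I", "II", "II", "III", "I", "II", "IV", "III", "II"]
grader = ["II", "I", "II", "III", "III", "II", "II", "IV", "III", "I"]
print(round(cohens_kappa(author, grader), 2))  # → 0.44
```

A value of 0.44 would fall at the upper end of the 0.16 to 0.46 range reported here, i.e., "moderate" agreement under the commonly used Landis and Koch benchmarks.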