Feurer I D, Becker G J, Picus D, Ramirez E, Darcy M D, Hicks M E
Journal of Vascular and Interventional Radiology, Nashville, TN.
JAMA. 1994 Jul 13;272(2):98-100. doi: 10.1001/jama.272.2.98.
To measure the reliability and preliminary validity of a grading instrument for editors to evaluate the quality of peer reviews.
The consecutive sample design included 53 reviews of 23 manuscripts. Reviews were systematically assigned to interrater reliability (n = 41; power greater than 0.90 to detect a difference of greater than one point) and preliminary criterion-related validity (n = 12) subsamples. Content validity was closely examined.
Nonclinical setting.
Three graders evaluated reliability. One individual examined content validity and two editors tested preliminary criterion-related validity.
INTERVENTION (INSTRUMENT)--Attributes reflecting two basic dimensions, review content and format, were identified and scored (values are possible points/percent contribution): timeliness, 3/21%; grade sheet, 1/7%; etiquette, 1/7%; sectional narratives, 3/21%; citations, 2/14%; narrative summary, 2/14%; and insights, 2/14%. A scoring guide was provided.
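The point values listed above sum to 14, and each percent contribution is that attribute's share of the total, rounded to a whole number. The following is a minimal sketch of that arithmetic only. The function names, the dictionary layout, and the range check are illustrative assumptions and are not taken from the published scoring guide.

```python
# Maximum points per attribute, as listed in the abstract.
MAX_POINTS = {
    "timeliness": 3,
    "grade sheet": 1,
    "etiquette": 1,
    "sectional narratives": 3,
    "citations": 2,
    "narrative summary": 2,
    "insights": 2,
}

# Total possible score: 3 + 1 + 1 + 3 + 2 + 2 + 2 = 14.
TOTAL_POSSIBLE = sum(MAX_POINTS.values())


def percent_contributions():
    """Return each attribute's share of the total, rounded to a whole percent."""
    return {
        name: round(100 * points / TOTAL_POSSIBLE)
        for name, points in MAX_POINTS.items()
    }


def total_score(awarded):
    """Sum the points awarded to one review (hypothetical helper).

    `awarded` maps attribute names to points. Unknown attributes and
    out-of-range values are rejected; this check is an assumption, not
    part of the published instrument.
    """
    for name, points in awarded.items():
        if name not in MAX_POINTS:
            raise KeyError(f"unknown attribute: {name}")
        if not 0 <= points <= MAX_POINTS[name]:
            raise ValueError(f"{name}: {points} is outside 0..{MAX_POINTS[name]}")
    return sum(awarded.values())
```

Rounding 3/14, 1/14, and 2/14 this way reproduces the 21%, 7%, and 14% figures quoted for the instrument.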
The interrater reliability of the total score was tested with the intraclass correlation coefficient and with analysis of variance, with the expectation that the null hypothesis of no difference between mean scores would be retained. Kendall's coefficient of concordance was used to test preliminary criterion-related validity.
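For readers unfamiliar with these statistics, the sketch below shows one common way to compute each. The abstract does not state which form of the intraclass correlation coefficient was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption here, as are the function names. The version of Kendall's coefficient of concordance shown omits the correction for tied ranks.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per review and one column per grader.

    Assumed form: two-way random effects, absolute agreement, single rater.
    """
    n = len(scores)        # number of reviews (rows)
    k = len(scores[0])     # number of graders (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def kendalls_w(rankings):
    """Kendall's W for a list of rankings, one list of ranks 1..n per rater.

    Assumes no tied ranks, so no tie correction is applied.
    """
    m = len(rankings)      # number of raters
    n = len(rankings[0])   # number of items ranked
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = sum(rank_sums) / n
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

Both statistics equal 1 under perfect agreement. Kendall's W falls to 0 when the raters' rank sums are identical across items, for example when two raters rank the items in exactly opposite order.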
The intraclass correlation coefficient was .84 (P < .001), and analysis of variance showed no significant difference between mean scores (P = .46). Content validity was confirmed, and preliminary criterion-related validity was indicated (Kendall's coefficient of concordance = .94, P = .038).
The instrument is reliable. Content validation has been completed, and further criterion-related validation is warranted.