How reliable is peer review of scientific abstracts? Looking back at the 1991 Annual Meeting of the Society of General Internal Medicine.

Authors

Rubin H R, Redelmeier D A, Wu A W, Steinberg E P

Affiliation

Division of Internal Medicine, Johns Hopkins University, Baltimore, Maryland 21205.

Publication

J Gen Intern Med. 1993 May;8(5):255-8. doi: 10.1007/BF02600092.

PMID:8505684

Abstract

OBJECTIVE

To evaluate the interrater reproducibility of scientific abstract review.

DESIGN

Retrospective analysis.

SETTING

Review for the 1991 Society of General Internal Medicine (SGIM) annual meeting.

SUBJECTS

426 abstracts in seven topic categories evaluated by 55 reviewers.

MEASUREMENTS

Reviewers rated abstracts from 1 (poor) to 5 (excellent), globally and on three specific dimensions: interest to the SGIM audience, quality of methods, and quality of presentation. Each abstract was reviewed by five to seven reviewers. Each reviewer's ratings of the three dimensions were added to compute that reviewer's summary score for a given abstract. The mean of all reviewers' summary scores for an abstract, the final score, was used by SGIM to select abstracts for the meeting.
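The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `summary_score` and `final_score` and the example ratings are hypothetical and do not come from the study.

```python
from statistics import mean


def summary_score(interest, methods, presentation):
    """One reviewer's summary score: the sum of the three 1-5 dimension ratings."""
    for rating in (interest, methods, presentation):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 (poor) and 5 (excellent)")
    return interest + methods + presentation


def final_score(reviews):
    """Final score for one abstract: the mean of all reviewers' summary scores."""
    return mean(summary_score(*review) for review in reviews)


# Hypothetical ratings from five reviewers: (interest, methods, presentation).
reviews = [(4, 3, 4), (3, 3, 3), (5, 4, 4), (2, 3, 3), (4, 4, 3)]
print(final_score(reviews))
```

Because each dimension runs from 1 to 5, a summary score (and therefore a final score) can only fall between 3 and 15. The global rating is collected separately and does not enter this sum.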

RESULTS

Final scores ranged from 4.6 to 13.6 (mean = 9.9). Although 222 abstracts (52%) were accepted for publication, the 95% confidence interval around the final score of 300 (70.4%) of the 426 abstracts overlapped with the threshold for acceptance of an abstract. Thus, these abstracts were potentially misclassified. Only 36% of the variance in summary scores was associated with an abstract's identity, 12% with the reviewer's identity, and the remainder with idiosyncratic reviews of abstracts. Global ratings were more reproducible than summary scores.
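To make the overlap criterion concrete, the sketch below checks whether a 95% confidence interval around one abstract's final score contains an acceptance threshold. The abstract does not say how the intervals were computed or what the threshold was, so the t-based interval, the threshold of 10.0, and the example scores are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only the cases for five to seven reviewers (df = 4, 5, 6) are included.
T_CRIT_95 = {4: 2.776, 5: 2.571, 6: 2.447}


def confidence_interval(summary_scores):
    """Assumed t-based 95% CI around the mean of the reviewers' summary scores."""
    n = len(summary_scores)
    if n - 1 not in T_CRIT_95:
        raise ValueError("this sketch supports five to seven reviewers only")
    center = mean(summary_scores)
    half_width = T_CRIT_95[n - 1] * stdev(summary_scores) / sqrt(n)
    return center - half_width, center + half_width


def overlaps_threshold(summary_scores, threshold):
    """True if the interval contains the threshold (potential misclassification)."""
    low, high = confidence_interval(summary_scores)
    return low <= threshold <= high


# Hypothetical summary scores and a hypothetical acceptance threshold.
threshold = 10.0
print(overlaps_threshold([11, 9, 13, 8, 11], threshold))    # reviewers disagree
print(overlaps_threshold([13, 13, 14, 13, 14], threshold))  # reviewers agree
```

With only five to seven reviewers per abstract, even moderate disagreement produces a wide interval, which is consistent with the large share of abstracts the study flags as potentially misclassified.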

CONCLUSION

Reviewers disagreed substantially when evaluating the same abstracts. Future meeting organizers may wish to rank abstracts using global ratings, and to experiment with structured review criteria and other ways to improve raters' agreement.
