Suppr 超能文献



Peer reviews of peer reviews: A randomized controlled trial and other experiments.

Authors

Goldberg Alexander, Stelmakh Ivan, Cho Kyunghyun, Oh Alice, Agarwal Alekh, Belgrave Danielle, Shah Nihar B

Affiliations

School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America.

New Economic School, Moscow, Russia.

Publication

PLoS One. 2025 Apr 2;20(4):e0320444. doi: 10.1371/journal.pone.0320444. eCollection 2025.

DOI: 10.1371/journal.pone.0320444
PMID: 40173178
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11964232/
Abstract

Is it possible to reliably evaluate the quality of peer reviews? We study this question driven by two primary motivations - incentivizing high-quality reviewing using assessed quality of reviews and measuring changes to review quality in experiments. We conduct a large scale study at the NeurIPS 2022 conference, a top-tier conference in machine learning, in which we invited (meta)-reviewers and authors to voluntarily evaluate reviews given to submitted papers. First, we conduct a randomized controlled trial to examine bias due to the length of reviews. We generate elongated versions of reviews by adding substantial amounts of non-informative content. Participants in the control group evaluate the original reviews, whereas participants in the experimental group evaluate the artificially lengthened versions. We find that lengthened reviews are scored (statistically significantly) higher quality than the original reviews. Additionally, in analysis of observational data we find that authors are positively biased towards reviews recommending acceptance of their own papers, even after controlling for confounders of review length, quality, and different numbers of papers per author. We also measure disagreement rates between multiple evaluations of the same review of 28% - 32%, which is comparable to that of paper reviewers at NeurIPS. Further, we assess the amount of miscalibration of evaluators of reviews using a linear model of quality scores and find that it is similar to estimates of miscalibration of paper reviewers at NeurIPS. Finally, we estimate the amount of variability in subjective opinions around how to map individual criteria to overall scores of review quality and find that it is roughly the same as that in the review of papers. Our results suggest that the various problems that exist in reviews of papers - inconsistency, bias towards irrelevant factors, miscalibration, subjectivity - also arise in reviewing of reviews.
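The abstract's central experiment compares quality scores given to original reviews (control) against artificially lengthened ones (treatment). A minimal sketch of how such a between-group difference could be tested is a two-sample permutation test. This is not the authors' actual analysis code, and all scores below are synthetic placeholders, not data from the study.

```python
import random

def permutation_test(control, treatment, n_perm=10_000, seed=0):
    """Two-sided p-value for the difference in mean quality scores."""
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)
    pooled = list(control) + list(treatment)
    n_c = len(control)
    hits = 0
    for _ in range(n_perm):
        # Under the null, group labels are exchangeable: reshuffle and
        # recompute the mean difference for the relabeled groups.
        rng.shuffle(pooled)
        diff = sum(pooled[n_c:]) / len(treatment) - sum(pooled[:n_c]) / n_c
        if abs(diff) >= abs(observed):
            hits += 1
    return hits / n_perm

# Synthetic quality scores on a 1-7 scale (placeholder data only).
control = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]    # original reviews
treatment = [5, 6, 5, 5, 6, 4, 5, 6, 5, 5]  # lengthened reviews
p = permutation_test(control, treatment)
print("p-value:", p)
```

A permutation test makes no normality assumption, which suits bounded ordinal quality scores; the study's reported finding is that lengthened reviews scored statistically significantly higher.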


Figures (PMC full text):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a8b0/11964232/5f649729f089/pone.0320444.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a8b0/11964232/3b924caa9f1e/pone.0320444.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a8b0/11964232/e71a598e9f22/pone.0320444.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a8b0/11964232/eb1dd032335d/pone.0320444.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a8b0/11964232/2a21137517bc/pone.0320444.g005.jpg

Similar Articles

1. Peer reviews of peer reviews: A randomized controlled trial and other experiments.
PLoS One. 2025 Apr 2;20(4):e0320444. doi: 10.1371/journal.pone.0320444. eCollection 2025.
2. Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
3. How do authors' perceptions of their papers compare with co-authors' perceptions and peer-review decisions?
PLoS One. 2024 Apr 10;19(4):e0300710. doi: 10.1371/journal.pone.0300710. eCollection 2024.
4. identifies gender disparities in scientific peer review.
Elife. 2023 Nov 3;12:RP90230. doi: 10.7554/eLife.90230.
5. The future of Cochrane Neonatal.
Early Hum Dev. 2020 Nov;150:105191. doi: 10.1016/j.earlhumdev.2020.105191. Epub 2020 Sep 12.
6. Statistical reviewers improve reporting in biomedical articles: a randomized trial.
PLoS One. 2007 Mar 28;2(3):e332. doi: 10.1371/journal.pone.0000332.
7. Effect on peer review of telling reviewers that their signed reviews might be posted on the web: randomised controlled trial.
BMJ. 2010 Nov 16;341:c5729. doi: 10.1136/bmj.c5729.
8. Effect of open peer review on quality of reviews and on reviewers' recommendations: a randomised trial.
BMJ. 1999 Jan 2;318(7175):23-7. doi: 10.1136/bmj.318.7175.23.
9. Differences in review quality and recommendations for publication between peer reviewers suggested by authors or by editors.
JAMA. 2006 Jan 18;295(3):314-7. doi: 10.1001/jama.295.3.314.
10. Testing for reviewer anchoring in peer review: A randomized controlled trial.
PLoS One. 2024 Nov 18;19(11):e0301111. doi: 10.1371/journal.pone.0301111. eCollection 2024.

References Cited in This Article

1. How do authors' perceptions of their papers compare with co-authors' perceptions and peer-review decisions?
PLoS One. 2024 Apr 10;19(4):e0300710. doi: 10.1371/journal.pone.0300710. eCollection 2024.
2. Tools used to assess the quality of peer review reports: a methodological systematic review.
BMC Med Res Methodol. 2019 Mar 6;19(1):48. doi: 10.1186/s12874-019-0688-x.
3. Peer review and competition in the Art Exhibition Game.
Proc Natl Acad Sci U S A. 2016 Jul 26;113(30):8414-9. doi: 10.1073/pnas.1603723113. Epub 2016 Jul 11.
4. Retrospective analysis of the quality of reports by author-suggested and non-author-suggested reviewers in journals operating on open or single-blind peer review models.
BMJ Open. 2015 Sep 29;5(9):e008707. doi: 10.1136/bmjopen-2015-008707.
5. Is Double-Blinded Peer Review Necessary? The Effect of Blinding on Review Quality.
Plast Reconstr Surg. 2015 Dec;136(6):1369-1377. doi: 10.1097/PRS.0000000000001820.
6. Effect on peer review of telling reviewers that their signed reviews might be posted on the web: randomised controlled trial.
BMJ. 2010 Nov 16;341:c5729. doi: 10.1136/bmj.c5729.
7. Longitudinal trends in the performance of scientific peer reviewers.
Ann Emerg Med. 2011 Feb;57(2):141-8. doi: 10.1016/j.annemergmed.2010.07.027. Epub 2010 Nov 12.
8. The relationship of previous training and experience of journal peer reviewers to subsequent review quality.
PLoS Med. 2007 Jan;4(1):e40. doi: 10.1371/journal.pmed.0040040.
9. Effects of training on quality of peer review: randomised controlled trial.
BMJ. 2004 Mar 20;328(7441):673. doi: 10.1136/bmj.38023.700775.AE. Epub 2004 Mar 2.
10. Author perception of peer review: impact of review quality and acceptance on satisfaction.
JAMA. 2002 Jun 5;287(21):2790-3. doi: 10.1001/jama.287.21.2790.