Hesselberg Jan-Ole, Dalsbø Therese K, Strømme Hilde, Svege Ida, Fretheim Atle
Department of Psychology, University of Oslo, Oslo, Norway.
Stiftelsen Dam, Oslo, Norway.
Cochrane Database Syst Rev. 2023 Nov 28;11(11):MR000056. doi: 10.1002/14651858.MR000056.pub2.
BACKGROUND: Funders and scientific journals use peer review to decide which projects to fund or which articles to publish. Reviewer training is one intervention intended to improve the quality of peer review. However, studies on the effects of such training have yielded inconsistent results, and there is no up-to-date systematic review addressing this question.
OBJECTIVES: To evaluate the effect of peer reviewer training on the quality of grant and journal peer review.
SEARCH METHODS: We used standard, extensive Cochrane search methods. The latest search date was 27 April 2022.
SELECTION CRITERIA: We included randomized controlled trials (RCTs, including cluster-RCTs) that compared training interventions for peer reviewers against usual processes, no training, or other interventions intended to improve the quality of peer review.
DATA COLLECTION AND ANALYSIS: We used standard Cochrane methods. Our primary outcomes were 1. completeness of reporting and 2. peer review detection of errors. Our secondary outcomes were 1. bibliometric scores, 2. stakeholders' assessment of peer review quality, 3. inter-reviewer agreement, 4. process-centred outcomes, 5. peer reviewer satisfaction, and 6. completion rate and speed of funded projects. We used the first version of the Cochrane risk of bias tool to assess the risk of bias, and we used GRADE to assess the certainty of evidence.
MAIN RESULTS: We included 10 RCTs with a total of 1213 units of analysis. The unit of analysis was the individual reviewer in seven studies (722 reviewers in total) and the reviewed manuscript in three studies (491 manuscripts in total). In eight RCTs, the participants were journal peer reviewers; in two, they were grant peer reviewers. The training interventions can be broadly divided into dialogue-based interventions (interactive workshops, face-to-face training, mentoring) and one-way communication (written information, video courses, checklists, written feedback). Most studies were small.

We found moderate-certainty evidence that emails reminding peer reviewers to check the items of reporting checklists, compared with standard journal practice, have little or no effect on the completeness of reporting, measured as the proportion of items (from 0.00 to 1.00) that were adequately reported (mean difference (MD) 0.02, 95% confidence interval (CI) -0.02 to 0.06; 2 RCTs, 421 manuscripts). There was low-certainty evidence that reviewer training, compared with standard journal practice, slightly improves peer reviewers' ability to detect errors (MD 0.55, 95% CI 0.20 to 0.90; 1 RCT, 418 reviewers). We found low-certainty evidence that reviewer training, compared with standard journal practice, has little or no effect on stakeholders' assessment of review quality in journal peer review (standardized mean difference (SMD) 0.13 standard deviations (SDs), 95% CI -0.07 to 0.33; 1 RCT, 418 reviewers), or on the change in stakeholders' assessment of review quality in journal peer review (SMD -0.15 SDs, 95% CI -0.39 to 0.10; 5 RCTs, 258 reviewers). We found very low-certainty evidence that a video course, compared with no video course, has little or no effect on inter-reviewer agreement in grant peer review (MD 0.14 points, 95% CI -0.07 to 0.35; 1 RCT, 75 reviewers). There was low-certainty evidence that structured individual feedback on scoring, compared with general information on scoring, has little or no effect on the change in inter-reviewer agreement in grant peer review (MD 0.18 points, 95% CI -0.14 to 0.50; 1 RCT, 41 reviewers).
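To make the effect measures quoted above concrete, the following is a minimal Python sketch (assuming only NumPy) of how a mean difference (MD) on the 0.00 to 1.00 completeness-of-reporting scale and a standardized mean difference (SMD, here Cohen's d) with 95% confidence intervals are computed. The data are simulated for illustration, not taken from the included trials, and the review's own analyses may differ in detail.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-manuscript completeness scores (proportion of checklist
# items adequately reported) for a trained and a control reviewer group.
trained = rng.uniform(0.4, 0.9, size=200)
control = rng.uniform(0.4, 0.9, size=221)

# Mean difference with a normal-approximation 95% CI.
md = trained.mean() - control.mean()
se = np.sqrt(trained.var(ddof=1) / trained.size
             + control.var(ddof=1) / control.size)
print(f"MD = {md:.2f}, 95% CI {md - 1.96 * se:.2f} to {md + 1.96 * se:.2f}")

# Standardized mean difference (Cohen's d): the MD divided by the pooled SD.
n1, n2 = trained.size, control.size
pooled_sd = np.sqrt(((n1 - 1) * trained.var(ddof=1)
                     + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2))
smd = md / pooled_sd
se_smd = np.sqrt((n1 + n2) / (n1 * n2) + smd**2 / (2 * (n1 + n2)))
print(f"SMD = {smd:.2f} SDs, "
      f"95% CI {smd - 1.96 * se_smd:.2f} to {smd + 1.96 * se_smd:.2f}")

Because the SMD divides the group difference by the pooled standard deviation, it expresses effects in SD units, which is why studies that measured review quality on different instruments can be reported on a common scale.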
AUTHORS' CONCLUSIONS: Evidence from 10 RCTs suggests that training peer reviewers may lead to little or no improvement in the quality of peer review. There is a need for studies with more participants and a broader spectrum of valid and reliable outcome measures. Studies evaluating stakeholders' assessments of the quality of peer review should ensure that the assessment instruments used have sufficient validity and reliability.