Hesselberg Jan-Ole, Fostervold Knut Inge, Ulleberg Pål, Svege Ida
Department of Psychology, University of Oslo, Oslo, Norway.
Faculty of Health Sciences, Oslo Metropolitan University, Oslo, Norway.
Res Integr Peer Rev. 2021 Sep 30;6(1):12. doi: 10.1186/s41073-021-00115-5.
Vast sums are distributed based on grant peer review, but studies show that interrater reliability is often low. In this study, we tested the effect of receiving two short individual feedback reports, compared with one short general feedback report, on agreement between reviewers.
A total of 42 reviewers at the Norwegian Foundation Dam were randomly assigned to receive either a general feedback report or an individual feedback report. The general feedback group received one report before the start of the reviews, containing general information about the previous call in which the reviewers had participated. The individual feedback group received two reports, one before the review period (based on the previous call) and one during the period (based on the current call); both presented detailed information on the reviewer's own scoring compared with the review committee as a whole. The main outcomes were the proportion of agreement in the eligibility assessment and the average difference in scores between pairs of reviewers assessing the same proposal. The outcomes were measured in 2017 and again in 2018, after the feedback had been provided.
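The two primary outcomes are simple pairwise statistics. The sketch below is not the authors' analysis code; it is a minimal illustration, with hypothetical field names, of how these outcomes could be computed from a set of paired reviews.

```python
def outcome_measures(paired_reviews):
    """Compute the two primary outcomes from paired reviews.

    paired_reviews: list of dicts, one per proposal assessed by two reviewers,
    e.g. {"eligible_a": True, "eligible_b": True, "score_a": 4.0, "score_b": 6.0}.
    Field names are hypothetical, not taken from the study.
    """
    n = len(paired_reviews)
    # Proportion of pairs in which both reviewers agree on eligibility.
    eligibility_agreement = sum(
        p["eligible_a"] == p["eligible_b"] for p in paired_reviews
    ) / n
    # Mean absolute difference between the two reviewers' proposal scores.
    mean_score_diff = sum(
        abs(p["score_a"] - p["score_b"]) for p in paired_reviews
    ) / n
    return eligibility_agreement, mean_score_diff
```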
A total of 2398 paired reviews were included in the analysis. There was a significant difference between the two groups in the proportion of absolute agreement on whether the proposal was eligible for the funding programme, with the general feedback group demonstrating a higher rate of agreement. There was no difference between the two groups in terms of the average score difference. However, the agreement regarding the proposal score remained critically low for both groups.
We did not observe changes in proposal score agreement between 2017 and 2018 in either feedback group. The low levels of agreement remain a major concern in grant peer review; research to identify contributing factors is still needed, as is the development and testing of interventions to increase agreement rates.
The study was preregistered at OSF.io/n4fq3.