Department for Evidence-based Medicine and Evaluation, Cochrane Austria, Danube University Krems, Krems, Austria; RTI-University of North Carolina Evidence-based Practice Center, RTI International, Research Triangle Park, NC, USA.
Department for Evidence-based Medicine and Evaluation, Cochrane Austria, Danube University Krems, Krems, Austria; Department of Family Medicine, Care and Public Health Research Institute, Maastricht University, Maastricht, The Netherlands.
J Clin Epidemiol. 2020 May;121:20-28. doi: 10.1016/j.jclinepi.2020.01.005. Epub 2020 Jan 21.
To determine the accuracy of single-reviewer screening in correctly classifying abstracts as relevant or irrelevant for literature reviews.
We conducted a crowd-based, parallel-group randomized controlled trial. Using the Cochrane Crowd platform, we randomly assigned eligible participants to screen 100 abstracts each on either a pharmacological or a public health topic. After completing a training exercise, participants screened abstracts online based on predefined inclusion and exclusion criteria. We calculated sensitivities and specificities of single- and dual-reviewer screening using two published systematic reviews as reference standards.
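As a minimal sketch of how sensitivity and specificity can be computed against a reference standard, the following Python example tallies screening decisions against reference labels. The function name `screening_accuracy` and the toy data are hypothetical and are not taken from the study; the study's own analysis may have differed in its details.

```python
from typing import Sequence, Tuple


def screening_accuracy(
    decisions: Sequence[bool], reference: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of screening decisions.

    decisions[i] is True if the reviewer included abstract i;
    reference[i] is True if the reference standard counts it as relevant.
    """
    if len(decisions) != len(reference):
        raise ValueError("decisions and reference must have the same length")

    true_pos = sum(d and r for d, r in zip(decisions, reference))
    false_neg = sum((not d) and r for d, r in zip(decisions, reference))
    true_neg = sum((not d) and (not r) for d, r in zip(decisions, reference))
    false_pos = sum(d and (not r) for d, r in zip(decisions, reference))

    # Sensitivity: share of relevant abstracts that were correctly included.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of irrelevant abstracts that were correctly excluded.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical toy data: 4 relevant and 6 irrelevant abstracts.
reference = [True, True, True, True, False, False, False, False, False, False]
decisions = [True, True, True, False, False, False, False, False, True, True]

sens, spec = screening_accuracy(decisions, reference)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example, one of the four relevant abstracts is missed and two of the six irrelevant abstracts are wrongly included.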
Two hundred and eighty participants made 24,942 screening decisions on 2,000 randomly selected abstracts from the reference standard reviews. On average, each abstract was screened 12 times. Overall, single-reviewer abstract screening missed 13% of relevant studies (sensitivity: 86.6%; 95% confidence interval [CI], 80.6%-91.2%). By comparison, dual-reviewer abstract screening missed 3% of relevant studies (sensitivity: 97.5%; 95% CI, 95.1%-98.8%). The corresponding specificities were 79.2% (95% CI, 77.4%-80.9%) for single-reviewer screening and 68.7% (95% CI, 66.4%-71.0%) for dual-reviewer screening.
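The trade-off in these figures (higher sensitivity but lower specificity with two reviewers) can be illustrated with a simple idealized model. The sketch below assumes, purely for illustration, that an abstract is retained if at least one of two reviewers includes it and that the two reviewers err independently; neither assumption is stated in this abstract, and the function name `dual_reviewer_accuracy` is hypothetical.

```python
from typing import Tuple


def dual_reviewer_accuracy(
    sens_single: float, spec_single: float
) -> Tuple[float, float]:
    """Idealized dual-reviewer accuracy under two ASSUMPTIONS:
    (1) an abstract is retained if either reviewer includes it, and
    (2) the two reviewers' errors are independent.
    """
    # A relevant abstract is missed only if both reviewers miss it.
    sens_dual = 1 - (1 - sens_single) ** 2
    # An irrelevant abstract is excluded only if both reviewers exclude it.
    spec_dual = spec_single ** 2
    return sens_dual, spec_dual


# Single-reviewer point estimates reported above.
sens_dual, spec_dual = dual_reviewer_accuracy(0.866, 0.792)
print(f"model sensitivity={sens_dual:.3f}, model specificity={spec_dual:.3f}")
```

Under these assumptions the model gives a sensitivity of about 98.2% and a specificity of about 62.7%, which moves in the same direction as the reported 97.5% and 68.7%. The gap between the model and the reported values would be expected if reviewers' errors are correlated or if the study's dual-reviewer procedure differs from the rule assumed here.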
Single-reviewer abstract screening does not appear to fulfill the high methodological standards that decision makers expect from systematic reviews. It may be a viable option for rapid reviews, which deliberately lower methodological standards to provide decision makers with accelerated evidence synthesis products.