RTI International, 307 Waverly Oaks Road, Suite 101, Waltham, MA, 02452, USA.
RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, 27709, USA.
Syst Rev. 2020 Oct 19;9(1):243. doi: 10.1186/s13643-020-01450-2.
The exponential growth of the biomedical literature necessitates investigating strategies to reduce systematic reviewer burden while maintaining the high standards of systematic review validity and comprehensiveness.
We compared the traditional systematic review screening process with (1) a review-of-reviews (ROR) screening approach and (2) a semi-automation screening approach using two publicly available tools (RobotAnalyst and AbstrackR) and different types of training sets (randomly selected citations subjected to dual review at the title-abstract stage, highly curated citations dually reviewed at the full-text stage, and a combination of the two). We evaluated performance in terms of sensitivity, specificity, missed citations, and workload burden.
RESULTS: The ROR approach for treatments of early-stage prostate cancer had poor sensitivity (0.54), and the studies it missed tended to be head-to-head comparisons of active treatments, observational studies, and studies of physical harms and quality-of-life outcomes. Title and abstract screening incorporating semi-automation achieved a sensitivity of 100% only at high levels of reviewer burden (review of 99% of citations). A highly curated, smaller training set (n = 125) performed similarly to a larger training set of random citations (n = 938).
Two approaches to rapidly updating systematic reviews (SRs), review-of-reviews and semi-automation, failed to demonstrate reduced workload burden while maintaining an acceptable level of sensitivity. We suggest careful evaluation of the ROR approach through comparison of inclusion criteria and targeted searches to fill evidence gaps, as well as further research on semi-automation use, including more study of highly curated training sets.
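The performance measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard confusion-matrix definitions of sensitivity and specificity, treats missed citations as false negatives, and takes workload burden to be the share of citations a reviewer still has to screen. The abstract does not give the authors' exact formulas, so these definitions, the `ScreeningCounts` class, and the counts in the example are all hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Citation counts for one screening approach, judged against the
    traditional dual-review process as the reference standard.
    (Hypothetical helper; not part of RobotAnalyst or AbstrackR.)"""
    true_pos: int           # relevant citations the approach retained
    false_neg: int          # relevant citations the approach missed
    true_neg: int           # irrelevant citations the approach excluded
    false_pos: int          # irrelevant citations the approach retained
    screened_by_human: int  # citations a reviewer still had to read

    @property
    def total(self) -> int:
        return self.true_pos + self.false_neg + self.true_neg + self.false_pos

    def sensitivity(self) -> float:
        # Share of truly relevant citations that were retained.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Share of truly irrelevant citations that were excluded.
        return self.true_neg / (self.true_neg + self.false_pos)

    def missed_citations(self) -> int:
        # Assumed here to equal the number of false negatives.
        return self.false_neg

    def workload_burden(self) -> float:
        # Assumed definition: fraction of all citations screened by a human.
        return self.screened_by_human / self.total


# Illustrative counts only; these are NOT results from the study.
example = ScreeningCounts(
    true_pos=45, false_neg=5, true_neg=900, false_pos=50,
    screened_by_human=500,
)
print(f"sensitivity:      {example.sensitivity():.2f}")
print(f"specificity:      {example.specificity():.2f}")
print(f"missed citations: {example.missed_citations()}")
print(f"workload burden:  {example.workload_burden():.2f}")
```

Under these assumed definitions, a screening approach is only useful for rapid updates if it keeps sensitivity high while pushing workload burden well below 1.0, which is the trade-off the abstract reports neither approach achieved.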