Zimmerman John, Soler Robin E, Lavinder James, Murphy Sarah, Atkins Charisma, Hulbert LaShonda, Lusk Richard, Ng Boon Peng
Deloitte Consulting, LLP, 191 Peachtree Street, Atlanta, GA, 30303, USA.
Centers for Disease Control and Prevention, National Center for Chronic Disease Prevention and Health Promotion, Division of Diabetes Translation, 1600 Clifton Rd, Atlanta, GA, USA.
Syst Rev. 2021 Apr 2;10(1):97. doi: 10.1186/s13643-021-01640-6.
Systematic reviews (SRs), studies of studies, use a formal process to evaluate the quality of scientific literature and to determine effectiveness from qualifying articles in order to establish consensus findings around a hypothesis. Their value is increasing as the conduct and publication of research and evaluation expand and the process of identifying key insights becomes more time consuming. Text analytics and machine learning (ML) techniques may help overcome this problem of scale while maintaining the level of rigor expected of SRs.
In this article, we discuss an approach that uses existing examples of SRs to build and test a method for assisting SR title-and-abstract pre-screening by reducing the initial pool of potential articles to those that meet the inclusion criteria. Our approach differs from previous uses of ML as an SR tool in that it incorporates ML configurations guided by previously conducted SRs, along with human confirmation of ML predictions of relevant articles during multiple iterative reviews of smaller tranches of citations. We applied the tailored method to a new SR effort to validate its performance.
A case study test of the approach demonstrated sensitivity (recall) in finding relevant articles during down-selection that may rival many traditional processes, and showed an ability to avoid most type II errors. The study achieved a sensitivity of 99.5% (213 of 214 relevant articles) while requiring human review of only 31% of the articles available for review.
We believe this iterative method can help overcome bias in initial ML model training by having humans reinforce the ML model with new, relevant information, and that it is an applied step toward transfer learning for ML in SRs.
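The iterative, human-in-the-loop screening described above can be sketched as a ranking-and-confirmation cycle: a model scores the unreviewed citations, a human reviews the top-ranked tranche, and confirmed-relevant articles are folded back into the model before the next round. Everything below is an illustrative assumption for this sketch, not the authors' implementation: the keyword-overlap scorer stands in for a trained ML classifier, and the citations, tranche size, and round count are invented.

```python
# Minimal sketch of iterative, human-confirmed pre-screening (assumptions
# throughout: keyword-overlap scoring stands in for a real ML model).

def score(abstract, relevant_terms):
    """Score a citation by word overlap with confirmed-relevant articles."""
    return len(set(abstract.lower().split()) & relevant_terms)

def iterative_screen(citations, is_relevant, seed_terms, tranche_size=2, rounds=2):
    """Rank citations, 'human review' the top tranche each round, and
    reinforce the model with terms from confirmed-relevant articles."""
    relevant_terms = set(seed_terms)
    reviewed, confirmed = set(), []
    for _ in range(rounds):
        pool = [c for c in citations if c not in reviewed]
        pool.sort(key=lambda c: score(c, relevant_terms), reverse=True)
        for c in pool[:tranche_size]:        # human reviews one tranche
            reviewed.add(c)
            if is_relevant(c):               # human confirms the ML prediction
                confirmed.append(c)
                relevant_terms |= set(c.lower().split())
    return confirmed, reviewed

# Hypothetical citation pool; three are truly relevant.
citations = [
    "diabetes prevention lifestyle intervention trial",
    "machine learning for text classification",
    "lifestyle intervention reduces diabetes incidence",
    "astronomy survey of distant galaxies",
    "diet and exercise intervention in prediabetes",
    "quantum computing hardware review",
]
truly_relevant = {citations[0], citations[2], citations[4]}
confirmed, reviewed = iterative_screen(
    citations, lambda c: c in truly_relevant, seed_terms={"diabetes", "intervention"}
)
sensitivity = len(confirmed) / len(truly_relevant)  # recall over relevant articles
review_fraction = len(reviewed) / len(citations)    # share of pool a human read
```

In this toy run the loop finds all three relevant citations (sensitivity 1.0) after reviewing four of the six in the pool, mirroring the trade-off reported in the study (99.5% sensitivity at 31% human review), though the numbers here are artifacts of the invented data.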