Research Group Digital Ethics, Knowledge Center Learning and Innovation (LENI), Archimedes Institute, HU University of Applied Sciences Utrecht, Utrecht, the Netherlands.
Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, the Netherlands.
Syst Rev. 2024 Mar 1;13(1):81. doi: 10.1186/s13643-024-02502-7.
Active learning has become an increasingly popular method for screening large amounts of data in systematic reviews and meta-analyses. The active learning process continually improves its predictions on the remaining unlabeled records, with the goal of identifying all relevant records as early as possible. However, determining the optimal point at which to stop the active learning process is a challenge: the cost of additional labeling of records by the reviewer must be balanced against the cost of erroneous exclusions. This paper introduces the SAFE procedure, a practical and conservative set of stopping heuristics that offers a clear guideline for determining when to end the active learning process in screening software like ASReview. Combining several different stopping heuristics helps to minimize the risk of missing relevant papers in the screening process. The proposed procedure balances the costs of continued screening against the risk of missing relevant records, giving reviewers a practical basis for deciding when to stop screening. Although active learning can significantly enhance the quality and efficiency of screening, this method may be more applicable to certain types of datasets and problems. Ultimately, the decision to stop the active learning process depends on careful consideration of the trade-off between the costs of additional record labeling and the potential errors of the current model for the specific dataset and context.