School of Computing, Robert Gordon University, Aberdeen, AB10 7GE, Scotland, UK.
Universidad El Bosque, Bogotá, Colombia.
Artif Intell Med. 2024 Nov;157:102989. doi: 10.1016/j.artmed.2024.102989. Epub 2024 Sep 26.
Systematic reviews (SRs) are foundational to policy and decision-making in healthcare and beyond. SRs thoroughly synthesise primary research on a specific topic while maintaining reproducibility and transparency. However, their rigour introduces two main challenges: the significant time involved and a continuously growing literature, which can lead to data omission and leave most SRs outdated even before they are published. As a solution, AI techniques have been leveraged to simplify the SR process, especially the abstract screening phase. Active learning (AL) has emerged as a preferred method among these techniques, allowing interactive learning through human input, and several AL software tools have been proposed for abstract screening. Despite AL's strengths, how the various parameters involved in AL influence the software's efficacy is still unclear. This research seeks to clarify this by exploring how different AL design choices, such as the initial training set and the query strategy, impact SR automation. Experimental evaluations were conducted on five complex medical SR datasets, and a GLM was used to interpret the findings statistically. Some AL variables, such as the feature extractor, initial training size, and classifier, yielded notable observations, and practical conclusions were drawn for SR and for other settings where AL is deployed.
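To make the AL parameters named in the abstract concrete, the following is a minimal sketch of a pool-based active learning loop for abstract screening. It is not the authors' software or experimental setup: the toy records in `POOL`, the labels in `TRUTH`, the bag-of-words feature extractor, the small naive Bayes classifier, and the two query strategies ("certainty" and "uncertainty") are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

# Hypothetical toy pool of abstracts and reviewer labels (1 = relevant).
POOL = [
    "randomised trial of drug therapy in diabetes patients",
    "drug therapy outcomes in a diabetes cohort",
    "survey of football injuries in schools",
    "trial of insulin therapy for diabetes",
    "economic history of railway construction",
    "football training and school performance",
    "insulin dosing trial in elderly patients",
    "railway timetable optimisation methods",
]
TRUTH = [1, 1, 0, 1, 0, 0, 1, 0]

def tokenize(text):
    # Assumed feature extractor: a plain bag of lowercase words.
    return text.lower().split()

class NaiveBayes:
    """Tiny multinomial naive Bayes with Laplace smoothing (assumed classifier)."""

    def fit(self, docs, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.n_docs = {0: 0, 1: 0}
        for doc, y in zip(docs, labels):
            self.counts[y].update(tokenize(doc))
            self.n_docs[y] += 1
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def prob_relevant(self, doc):
        # Smoothed class prior, then smoothed per-word likelihood ratios.
        log_odds = math.log((self.n_docs[1] + 1) / (self.n_docs[0] + 1))
        v = len(self.vocab) + 1
        total = {c: sum(self.counts[c].values()) for c in (0, 1)}
        for w in tokenize(doc):
            p1 = (self.counts[1][w] + 1) / (total[1] + v)
            p0 = (self.counts[0][w] + 1) / (total[0] + v)
            log_odds += math.log(p1 / p0)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-log_odds))

def screen(pool, oracle, initial_size=2, strategy="certainty", budget=6, seed=0):
    """Label records one at a time until `budget` records have been labelled."""
    rng = random.Random(seed)
    order = list(range(len(pool)))
    rng.shuffle(order)
    # Initial training set: a random seed sample labelled by the reviewer.
    labelled = {i: oracle(i) for i in order[:initial_size]}
    while len(labelled) < min(budget, len(pool)):
        idx = list(labelled)
        model = NaiveBayes().fit([pool[i] for i in idx], [labelled[i] for i in idx])
        unlabelled = [i for i in range(len(pool)) if i not in labelled]
        if strategy == "uncertainty":
            # Query the record the model is least sure about.
            pick = min(unlabelled, key=lambda i: abs(model.prob_relevant(pool[i]) - 0.5))
        else:
            # Query the record the model ranks as most likely relevant.
            pick = max(unlabelled, key=lambda i: model.prob_relevant(pool[i]))
        labelled[pick] = oracle(pick)  # human input enters the loop here
    return labelled

if __name__ == "__main__":
    for strategy in ("certainty", "uncertainty"):
        result = screen(POOL, lambda i: TRUTH[i], strategy=strategy)
        found = sum(result.values())
        print(strategy, "relevant found:", found, "of", sum(TRUTH))
```

Varying `initial_size`, `strategy`, the tokeniser, or the classifier in this sketch mirrors, on a very small scale, the kind of parameter comparison the abstract describes; the numbers it prints say nothing about the paper's results.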