Biomedical Informatics and Digital Health, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia.
Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA.
Res Synth Methods. 2024 Jan;15(1):73-85. doi: 10.1002/jrsm.1672. Epub 2023 Sep 25.
Searching for trials is a key task in systematic reviews and a focus of automation. Previous approaches required knowing examples of relevant trials in advance, and most methods focus on published trial articles. To complement existing tools, we compared methods for finding relevant trial registrations given an International Prospective Register of Systematic Reviews (PROSPERO) entry and where no relevant trials have been screened for inclusion in advance. We compared PICO extraction based on SciBERT (an extension of Bidirectional Encoder Representations from Transformers), MetaMap, and term-based representations using an imperfect dataset mined from 3632 PROSPERO entries connected to a subset of 65,662 trial registrations and 65,834 trial articles known to be included in systematic reviews. Performance was measured by the median rank and recall by rank of trials that were eventually included in the published systematic reviews. When ranking trial registrations relative to PROSPERO entries, 296 trial registrations needed to be screened to identify half of the relevant trials, and the best-performing approach used a basic term-based representation. When ranking trial articles relative to PROSPERO entries, 162 trial articles needed to be screened to identify half of the relevant trials, and the best-performing approach used a term-based representation. The results show that MetaMap and term-based representations outperformed approaches that included PICO extraction for this use case. They also suggest that when starting with a PROSPERO entry and where no trials have been screened for inclusion, automated methods can reduce workload, but additional processes are still needed to efficiently identify trial registrations or trial articles that meet the inclusion criteria of a systematic review.
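The two evaluation measures named in the abstract, median rank and recall by rank, can be illustrated with a minimal sketch. This is not the authors' code: the function names and toy trial identifiers below are hypothetical, and the abstract does not say how relevant trials missing from the candidate ranking were handled, so the sketch simply assumes every relevant trial appears in the ranked list.

```python
from statistics import median


def ranks_of_relevant(ranked_ids, relevant_ids):
    """Return the 1-based rank of each relevant item found in the ranking."""
    relevant = set(relevant_ids)
    return [rank for rank, doc_id in enumerate(ranked_ids, start=1)
            if doc_id in relevant]


def median_rank(ranked_ids, relevant_ids):
    """Median position of the relevant items: roughly how many records must
    be screened, top down, before half of the relevant ones are seen."""
    return median(ranks_of_relevant(ranked_ids, relevant_ids))


def recall_at_rank(ranked_ids, relevant_ids, k):
    """Fraction of all relevant items that appear in the top k positions."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    found = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return found / len(relevant)


# Toy example (hypothetical identifiers): six candidate trials ranked against
# one review entry, three of which were eventually included in the review.
ranked = ["T07", "T02", "T11", "T05", "T09", "T01"]
included = ["T02", "T05", "T01"]

print(median_rank(ranked, included))        # relevant ranks are 2, 4, 6
print(recall_at_rank(ranked, included, 4))  # two of three found in the top 4
```

Under this reading, a reported figure such as "296 trial registrations needed to be screened to identify half of the relevant trials" corresponds to a median rank of 296 for the included registrations.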