Measuring the impact of screening automation on meta-analyses of diagnostic test accuracy.

Affiliations

LIMSI, CNRS, Université Paris Saclay, Rue du Belvedère, Orsay, 91405, France.

Amsterdam Public Health, Amsterdam UMC, University of Amsterdam, Meibergdreef 9, Amsterdam, 1105 AZ, the Netherlands.

Publication information

Syst Rev. 2019 Oct 28;8(1):243. doi: 10.1186/s13643-019-1162-x.

DOI:10.1186/s13643-019-1162-x

PMID:31661028

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6819363/

Abstract

BACKGROUND

The large and increasing number of new studies published each year is making literature identification in systematic reviews ever more time-consuming and costly. Technological assistance has been suggested as an alternative to conventional, manual study identification to mitigate the cost, but previous literature has mainly evaluated methods in terms of recall (search sensitivity) and workload reduction. There is also a need to evaluate whether screening prioritization methods lead to the same results and conclusions as exhaustive manual screening. In this study, we examined the impact of one screening prioritization method based on active learning on sensitivity and specificity estimates in systematic reviews of diagnostic test accuracy.

METHODS

We simulated the screening process in 48 Cochrane reviews of diagnostic test accuracy and reran 400 meta-analyses based on at least 3 studies. We compared screening prioritization (with technological assistance) and screening in randomized order (standard practice without technological assistance). We examined whether the screening could have been stopped before identifying all relevant studies while still producing reliable summary estimates. For all meta-analyses, we also examined the relationship between the number of relevant studies and the reliability of the final estimates.
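The comparison between prioritized and randomized screening order can be illustrated with a minimal simulation sketch. This is not the authors' implementation: the candidate pool, the number of relevant studies, and the noisy fixed scores that stand in for an active-learning ranker are all hypothetical assumptions chosen for illustration.

```python
import random


def recall_curve(order, relevant):
    """Return recall after each screened candidate, in the given order."""
    found = 0
    curve = []
    for doc in order:
        if doc in relevant:
            found += 1
        curve.append(found / len(relevant))
    return curve


def screened_to_reach(curve, target_recall):
    """Return how many candidates were screened when recall first hits the target."""
    for n_screened, recall in enumerate(curve, start=1):
        if recall >= target_recall:
            return n_screened
    return len(curve)


rng = random.Random(0)

# Hypothetical review: 1000 candidate articles, 20 of them relevant.
candidates = list(range(1000))
relevant = set(rng.sample(candidates, 20))

# Assumption: a fixed, noisy relevance score replaces the active-learning
# ranker; relevant articles tend to score higher than irrelevant ones.
scores = {d: rng.gauss(2.0 if d in relevant else 0.0, 1.0) for d in candidates}

prioritized = sorted(candidates, key=lambda d: scores[d], reverse=True)
randomized = candidates[:]
rng.shuffle(randomized)

prioritized_curve = recall_curve(prioritized, relevant)
randomized_curve = recall_curve(randomized, relevant)

# Screening effort needed to reach 70% recall under each ordering.
n_prior = screened_to_reach(prioritized_curve, 0.7)
n_rand = screened_to_reach(randomized_curve, 0.7)
print(f"prioritized: {n_prior} screened, randomized: {n_rand} screened")
```

In a real simulation the ranking would be updated as screening decisions accumulate, which is what distinguishes active learning from the static ordering used here.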

RESULTS

The main meta-analysis in each systematic review could have been performed after screening an average of 30% of the candidate articles (range 0.07 to 100%). No systematic review would have required screening more than 2308 studies, whereas manual screening would have required screening up to 43,363 studies. Despite an average 70% recall, the estimation error would have been 1.3% on average, compared to an average 2% estimation error expected when replicating summary estimate calculations.
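To make the notion of estimation error concrete, the sketch below compares summary estimates from a partial set of studies with those from the full set. It rests on two assumptions not stated in the abstract: the error is taken as the absolute difference between the two estimates, and simple pooling of 2x2 counts replaces the meta-analytic model actually used in the reviews. All study counts are invented for illustration.

```python
def pooled_sens_spec(studies):
    """Pool 2x2 counts across studies and return (sensitivity, specificity).

    Assumption: naive pooling of counts, used here only as a simple
    stand-in for a proper diagnostic meta-analysis model.
    """
    tp = sum(s["tp"] for s in studies)
    fn = sum(s["fn"] for s in studies)
    tn = sum(s["tn"] for s in studies)
    fp = sum(s["fp"] for s in studies)
    return tp / (tp + fn), tn / (tn + fp)


def estimation_error(subset, full):
    """Absolute difference in pooled sensitivity and specificity."""
    sens_sub, spec_sub = pooled_sens_spec(subset)
    sens_full, spec_full = pooled_sens_spec(full)
    return abs(sens_sub - sens_full), abs(spec_sub - spec_full)


# Hypothetical 2x2 counts for four included studies.
all_studies = [
    {"tp": 45, "fn": 5, "tn": 90, "fp": 10},
    {"tp": 40, "fn": 10, "tn": 85, "fp": 15},
    {"tp": 42, "fn": 8, "tn": 88, "fp": 12},
    {"tp": 38, "fn": 12, "tn": 80, "fp": 20},
]

# Suppose screening stopped after finding the first three (75% recall).
found_studies = all_studies[:3]

err_sens, err_spec = estimation_error(found_studies, all_studies)
print(f"sensitivity error: {err_sens:.3f}, specificity error: {err_spec:.3f}")
```

With these invented counts, missing one of four studies shifts both estimates by roughly two percentage points, which shows how incomplete recall can still leave the summary estimates close to their full-data values.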

CONCLUSION

Screening prioritization coupled with stopping criteria in diagnostic test accuracy reviews can reliably detect when the screening process has identified a sufficient number of studies to perform the main meta-analysis with an accuracy within pre-specified tolerance limits. However, many of the systematic reviews did not identify enough studies for the meta-analyses to be accurate within a 2% limit, even with exhaustive manual screening, i.e., using current practice.
