Semi-automated screening of biomedical citations for systematic reviews.

Affiliation

Department of Computer Science, Tufts University, Medford, MA, USA.

Publication information

BMC Bioinformatics. 2010 Jan 26;11:55. doi: 10.1186/1471-2105-11-55.

DOI:10.1186/1471-2105-11-55

PMID:20102628

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2824679/

Abstract

BACKGROUND

Systematic reviews address a specific clinical question by unbiasedly assessing and analyzing the pertinent literature. Citation screening is a time-consuming and critical step in systematic reviews. Typically, reviewers must evaluate thousands of citations to identify articles eligible for a given review. We explore the application of machine learning techniques to semi-automate citation screening, thereby reducing the reviewers' workload.

RESULTS

We present a novel online classification strategy for citation screening to automatically discriminate "relevant" from "irrelevant" citations. We use an ensemble of Support Vector Machines (SVMs) built over different feature-spaces (e.g., abstract and title text), and trained interactively by the reviewer(s). Semi-automating the citation screening process is difficult because any such strategy must identify all citations eligible for the systematic review. This requirement is made harder still due to class imbalance; there are far fewer "relevant" than "irrelevant" citations for any given systematic review. To address these challenges we employ a custom active-learning strategy developed specifically for imbalanced datasets. Further, we introduce a novel undersampling technique. We provide experimental results over three real-world systematic review datasets, and demonstrate that our algorithm is able to reduce the number of citations that must be screened manually by nearly half in two of these, and by around 40% in the third, without excluding any of the citations eligible for the systematic review.
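The ingredients named above (one SVM per feature space, reviewer-in-the-loop training, and undersampling of the majority class) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify their active-learning or undersampling rules, so plain uncertainty sampling and random undersampling are swapped in as generic stand-ins. The function names, the toy citations, and the rule "flag a citation as relevant if any ensemble member scores it positive" are all assumptions made for illustration.

```python
import random
from collections import Counter

def bag_of_words(text):
    """Sparse term counts; a deliberately simple stand-in for real text features."""
    return Counter(text.lower().split())

def dot(w, x):
    return sum(w.get(term, 0.0) * count for term, count in x.items())

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Linear SVM (no bias term) fit by Pegasos-style SGD on the hinge loss; y is +1 or -1."""
    rng = random.Random(seed)
    w, t = {}, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * dot(w, X[i])   # computed before this step's update
            for term in w:                 # L2 shrinkage
                w[term] *= 1.0 - eta * lam
            if margin < 1:                 # hinge loss is active for this example
                for term, count in X[i].items():
                    w[term] = w.get(term, 0.0) + eta * y[i] * count
    return w

def undersample(ids, labels, rng):
    """Random undersampling (an assumed stand-in): drop irrelevant examples until classes match."""
    pos = [i for i in ids if labels[i] == 1]
    neg = [i for i in ids if labels[i] == -1]
    if len(neg) > len(pos) > 0:
        neg = rng.sample(neg, len(pos))
    return pos + neg

def screen(citations, oracle, seed_ids, budget, seed=0):
    """Ask the reviewer (oracle) for `budget` extra labels, then classify the rest."""
    rng = random.Random(seed)
    spaces = ("title", "abstract")         # one SVM per feature space
    feats = {s: [bag_of_words(c[s]) for c in citations] for s in spaces}
    labels = {i: oracle(i) for i in seed_ids}
    for step in range(budget + 1):
        train_ids = undersample(sorted(labels), labels, rng)
        models = {s: train_linear_svm([feats[s][i] for i in train_ids],
                                      [labels[i] for i in train_ids])
                  for s in spaces}
        pool = [i for i in range(len(citations)) if i not in labels]
        if step == budget or not pool:
            break
        # Uncertainty sampling: query the citation closest to the decision boundaries.
        query = min(pool, key=lambda i: sum(abs(dot(models[s], feats[s][i]))
                                            for s in spaces))
        labels[query] = oracle(query)
    # Recall-oriented vote (an assumption): relevant if ANY member says so.
    predictions = {i: 1 if any(dot(models[s], feats[s][i]) > 0 for s in spaces) else -1
                   for i in pool}
    return labels, predictions

# Invented toy citations and labels, for illustration only.
CITATIONS = [
    {"title": "proton therapy trial", "abstract": "randomized trial of proton therapy for cancer"},
    {"title": "soil bacteria survey", "abstract": "survey of bacteria in forest soil"},
    {"title": "carbon ion therapy outcomes", "abstract": "outcomes of carbon ion therapy for cancer"},
    {"title": "bird migration patterns", "abstract": "tracking migration of coastal birds"},
    {"title": "river sediment chemistry", "abstract": "chemistry of sediment in river deltas"},
    {"title": "forest soil nutrients", "abstract": "nutrients measured in forest soil"},
]
TRUTH = [1, -1, 1, -1, -1, -1]

labels, predictions = screen(CITATIONS, lambda i: TRUTH[i], seed_ids=[0, 1], budget=2)
print(len(labels), "labeled by the reviewer;", len(predictions), "classified automatically")
```

In this sketch the reviewer's effort is the number of labels requested, and everything left in the pool is classified by the ensemble. A real deployment would need a stopping rule and a check that no eligible citation is lost, neither of which is shown here.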

CONCLUSIONS

We have developed a semi-automated citation screening algorithm for systematic reviews that has the potential to substantially reduce the number of citations reviewers have to manually screen, without compromising the quality and comprehensiveness of the review.
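The two quantities this conclusion balances can be made concrete with a small sketch: the fraction of citations the reviewers no longer screen by hand, and the fraction of eligible citations that are still identified. The function name and the numbers below are hypothetical and are not taken from the paper's experiments.

```python
def screening_summary(n_total, n_screened_manually, relevant_ids, identified_ids):
    """Summarize a semi-automated screening run.

    workload_reduction: share of citations the reviewers did not screen by hand.
    recall: share of truly eligible citations that were identified.
    """
    relevant, identified = set(relevant_ids), set(identified_ids)
    return {
        "workload_reduction": 1.0 - n_screened_manually / n_total,
        "recall": len(relevant & identified) / len(relevant) if relevant else 1.0,
    }

# Hypothetical run: 1000 citations, 520 screened by hand, all 3 eligible ones found.
summary = screening_summary(1000, 520, relevant_ids={7, 42, 310},
                            identified_ids={7, 42, 310, 998})
print(summary)
```

A run meets the requirement stated in the abstract only when recall is 1.0; workload reduction is worth reporting only under that constraint.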
