Extending PubMed searches to ClinicalTrials.gov through a machine learning approach for systematic reviews.

Affiliations

Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic and Vascular Sciences, University of Padova, Via Loredan 18, Padova 35131, Italy.

Department of Biological Sciences and Bioengineering (BSBE), IIT, Kanpur, India.

Publication details

J Clin Epidemiol. 2018 Nov;103:22-30. doi: 10.1016/j.jclinepi.2018.06.015. Epub 2018 Jul 5.

Abstract

OBJECTIVES

Despite their essential role in collecting and organizing published medical literature, indexed search engines are unable to cover all relevant knowledge. Hence, current literature recommends the inclusion of clinical trial registries in systematic reviews (SRs). This study aims to provide an automated approach to extend a search on PubMed to the ClinicalTrials.gov database, relying on text mining and machine learning techniques.

STUDY DESIGN AND SETTING

The procedure starts from a literature search on PubMed. Next, a classifier is trained to identify documents in the ClinicalTrials.gov clinical trial repository whose word characterization is comparable to that of the retrieved PubMed records. Fourteen SRs, covering a broad range of health conditions, serve as case studies for external validation. The classifier is a cross-validated support-vector machine (SVM) model.
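The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: represent each document by its word counts, train a linear SVM on labeled texts, check it with k-fold cross-validation, and then apply it to registry records. Everything here is an assumption made for illustration, including the toy texts, the function names, the hyperparameters, the omission of a bias term, and the use of plain sub-gradient descent; it is not the authors' actual features, kernel, or tuning. In practice an established machine learning library would likely replace this hand-written trainer.

```python
import math
from collections import Counter

def featurize(text):
    """Bag-of-words vector with L2 normalisation, stored as a dict."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}

def score(weights, features):
    """Linear decision score; positive means predicted on-topic."""
    return sum(weights.get(w, 0.0) * v for w, v in features.items())

def train_linear_svm(vectors, labels, lam=0.01, eta=0.1, epochs=20):
    """Sub-gradient descent on the L2-regularised hinge loss (no bias term).

    labels are +1 (on-topic) or -1 (off-topic).
    """
    weights = {}
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            margin = y * score(weights, x)
            for word in weights:              # shrinkage from the L2 penalty
                weights[word] *= 1.0 - eta * lam
            if margin < 1.0:                  # hinge loss is active
                for word, value in x.items():
                    weights[word] = weights.get(word, 0.0) + eta * y * value
    return weights

def predict(weights, text):
    return 1 if score(weights, featurize(text)) > 0 else -1

def cross_validate(texts, labels, k=3):
    """Plain k-fold cross-validation; returns the mean accuracy."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(texts)) if i % k != fold]
        test = [i for i in range(len(texts)) if i % k == fold]
        w = train_linear_svm([featurize(texts[i]) for i in train],
                             [labels[i] for i in train])
        correct += sum(predict(w, texts[i]) == labels[i] for i in test)
    return correct / len(texts)

# Hypothetical training texts: +1 stands for records retrieved by the
# PubMed search, -1 for unrelated records.
texts = [
    "statin therapy lowers cholesterol",
    "cholesterol reduction with statin treatment",
    "statin trial cardiovascular cholesterol outcomes",
    "asthma inhaler improves breathing",
    "inhaler use among asthma patients",
    "asthma breathing exercises inhaler adherence",
]
labels = [1, 1, 1, -1, -1, -1]

weights = train_linear_svm([featurize(t) for t in texts], labels)

# Hypothetical registry record titles to screen.
registry_records = [
    "statin cholesterol prevention study",
    "asthma inhaler device comparison",
]
flagged = [r for r in registry_records if predict(weights, r) == 1]
```

In this toy setting `flagged` contains only the statin record. Real registry records share far more vocabulary across topics, so a realistic version would need stop-word handling, term weighting, and tuned regularisation.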

RESULTS

Sensitivity was 100% in all SRs except one (87.5%), and specificity ranged from 97.2% to 99.9%. The instrument's ability to distinguish on-topic from off-topic articles, measured as the area under the receiver operating characteristic curve, ranged from 93.4% to 99.9%.
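For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, and the area under the ROC curve are conventionally computed from labeled predictions. The counts are hypothetical and chosen only so that the arithmetic is easy to follow (7 of 8 relevant records found gives 87.5%); the abstract does not report the underlying counts. The function names are likewise illustrative.

```python
def sensitivity_specificity(y_true, y_pred):
    """Label 1 means on-topic; any other label means off-topic."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p != 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != 1 and p != 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != 1 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(y_true, scores):
    """AUC as the probability that a random on-topic record scores higher
    than a random off-topic one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t != 1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical screening result: 8 relevant and 40 irrelevant records,
# with one relevant record missed and one irrelevant record flagged.
y_true = [1] * 8 + [0] * 40
y_pred = [1] * 7 + [0] + [1] + [0] * 39
sens, spec = sensitivity_specificity(y_true, y_pred)   # 0.875 and 0.975

# Hypothetical classifier scores for four records.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])      # 0.75
```

The pairwise definition of the AUC used here is quadratic in the number of records, which is fine for illustration but slow for large registries.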

CONCLUSION

The proposed machine learning instrument has the potential to help researchers identify relevant studies in the SR process by reducing workload without losing sensitivity, at a small cost in specificity.
