Automated confidence ranked classification of randomized controlled trial articles: an aid to evidence-based medicine.

Authors

Cohen Aaron M, Smalheiser Neil R, McDonagh Marian S, Yu Clement, Adams Clive E, Davis John M, Yu Philip S

Affiliations

Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR 97239 USA.

Department of Psychiatry, University of Illinois at Chicago, Chicago, IL 60612 USA.

Publication information

J Am Med Inform Assoc. 2015 May;22(3):707-17. doi: 10.1093/jamia/ocu025. Epub 2015 Feb 5.

DOI: 10.1093/jamia/ocu025
PMID: 25656516
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4457112/

Abstract

OBJECTIVE

For many literature review tasks, including systematic review (SR) and other aspects of evidence-based medicine, it is important to know whether an article describes a randomized controlled trial (RCT). Current manual annotation is not complete or flexible enough for the SR process. In this work, highly accurate machine learning predictive models were built that include confidence predictions of whether an article is an RCT.

MATERIALS AND METHODS

The LibSVM classifier was used with forward selection of potential feature sets on a large human-related subset of MEDLINE to create a classification model requiring only the citation, abstract, and MeSH terms for each article.
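
The forward selection step mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the feature-set names and the `toy_score` function are hypothetical stand-ins, and in a real pipeline the scorer would train the classifier on the chosen feature sets and return a validation metric.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def forward_select(
    candidates: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    min_gain: float = 1e-6,
) -> Tuple[List[str], float]:
    """Greedy forward selection: repeatedly add the candidate feature set
    that most improves the score, stopping when no addition helps."""
    selected: List[str] = []
    best = score(frozenset())
    remaining = list(candidates)
    while remaining:
        # Score each remaining feature set when added to the current selection.
        trials = [(score(frozenset(selected + [c])), c) for c in remaining]
        top_score, top_name = max(trials)
        if top_score - best < min_gain:
            break  # no remaining feature set gives a meaningful gain
        selected.append(top_name)
        remaining.remove(top_name)
        best = top_score
    return selected, best


# Hypothetical gains per feature set (illustrative numbers only).
TOY_GAIN = {"title_words": 0.20, "abstract_words": 0.15,
            "mesh_terms": 0.10, "journal": 0.0}


def toy_score(features: FrozenSet[str]) -> float:
    # Stand-in for "train the classifier and measure validation performance".
    return 0.5 + sum(TOY_GAIN[f] for f in features)


chosen, final_score = forward_select(list(TOY_GAIN), toy_score)
print(chosen, round(final_score, 2))
```

With these toy numbers the uninformative `journal` feature set is never added, because it gives no gain over the current selection.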

RESULTS

The model achieved an area under the receiver operating characteristic curve of 0.973 and a mean squared error of 0.013 on the held-out year 2011 data. Accurate confidence estimates were confirmed on a manually reviewed set of test articles. A second model not requiring MeSH terms was also created, and performed almost as well.
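
As a rough illustration of the two metrics reported above, the sketch below computes the area under the ROC curve and the mean squared error for a handful of made-up confidence scores. It assumes the mean squared error is taken between the predicted confidence and the 0/1 RCT label; the labels and scores are invented for the example and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as half (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_squared_error(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean squared difference between predicted confidence and 0/1 label."""
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)


# Invented example: 1 = RCT, 0 = not an RCT.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.40, 0.60, 0.20, 0.10, 0.05, 0.02]

auc = roc_auc(labels, scores)
mse = mean_squared_error(labels, scores)
print(f"AUC = {auc:.3f}, MSE = {mse:.3f}")
```

AUC measures only how well the scores rank RCTs above non-RCTs, while the mean squared error also penalizes confidence values that are far from the true label, which is why the two are reported together.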

DISCUSSION

Both models accurately rank and predict article RCT confidence. Using the model and the manually reviewed samples, it is estimated that about 8000 (3%) additional RCTs can be identified in MEDLINE, and that 5% of articles tagged as RCTs in MEDLINE may not be identified.

CONCLUSION

Retagging human-related studies with a continuously valued RCT confidence is potentially more useful for article ranking and review than a simple yes/no prediction. The automated RCT tagging tool should offer significant savings of time and effort during the process of writing SRs, and is a key component of a multistep text mining pipeline that we are building to streamline SR workflow. In addition, the model may be useful for identifying errors in MEDLINE publication types. The RCT confidence predictions described here have been made available to users as a web service with a user query form front end at: http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/RCT_Tagger.cgi.
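
To make the ranking argument concrete, here is a small sketch of how a reviewer might use a continuous confidence value instead of a yes/no tag. The record type, identifiers, scores, and thresholds are all hypothetical; the sketch does not call the web service described above.

```python
from typing import List, NamedTuple


class ScoredArticle(NamedTuple):
    article_id: str    # placeholder identifier, not a real PMID
    confidence: float  # predicted RCT confidence in [0, 1]


def rank_by_confidence(articles: List[ScoredArticle]) -> List[ScoredArticle]:
    """Order articles from most to least likely to be an RCT."""
    return sorted(articles, key=lambda a: a.confidence, reverse=True)


def screen(articles: List[ScoredArticle], threshold: float) -> List[ScoredArticle]:
    """Keep ranked articles whose confidence meets a reviewer-chosen threshold."""
    return [a for a in rank_by_confidence(articles) if a.confidence >= threshold]


articles = [
    ScoredArticle("A", 0.12),
    ScoredArticle("B", 0.97),
    ScoredArticle("C", 0.55),
    ScoredArticle("D", 0.81),
]

ranked = rank_by_confidence(articles)
strict = screen(articles, threshold=0.9)   # fewer articles to read, may miss RCTs
lenient = screen(articles, threshold=0.5)  # more articles to read, fewer misses
print([a.article_id for a in ranked])
```

A binary tag fixes one cut-off for everyone, whereas a ranked list lets each review team choose its own trade-off between screening workload and the risk of missing a trial.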

Similar articles

1. Automated confidence ranked classification of randomized controlled trial articles: an aid to evidence-based medicine.
   J Am Med Inform Assoc. 2015 May;22(3):707-17. doi: 10.1093/jamia/ocu025. Epub 2015 Feb 5.
2. A quantitative model for linking two disparate sets of articles in MEDLINE.
   Bioinformatics. 2007 Jul 1;23(13):1658-65. doi: 10.1093/bioinformatics/btm161. Epub 2007 Apr 26.
3. A probabilistic automated tagger to identify human-related publications.
   Database (Oxford). 2018 Jan 1;2018:1-8. doi: 10.1093/database/bay079.
4. Identifying reports of randomized controlled trials (RCTs) via a hybrid machine learning and crowdsourcing approach.
   J Am Med Inform Assoc. 2017 Nov 1;24(6):1165-1168. doi: 10.1093/jamia/ocx053.
5. Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide.
   Res Synth Methods. 2018 Dec;9(4):602-614. doi: 10.1002/jrsm.1287. Epub 2018 Feb 7.
6. Evaluation of publication type tagging as a strategy to screen randomized controlled trial articles in preparing systematic reviews.
   JAMIA Open. 2022 Mar 30;5(1):ooac015. doi: 10.1093/jamiaopen/ooac015. eCollection 2022 Apr.
7. Wnt pathway curation using automated natural language processing: combining statistical methods with partial and full parse for knowledge extraction.
   Bioinformatics. 2005 Apr 15;21(8):1653-8. doi: 10.1093/bioinformatics/bti165. Epub 2004 Nov 25.
8. Automated information extraction of key trial design elements from clinical trial publications.
   AMIA Annu Symp Proc. 2008 Nov 6;2008:141-5.
9. Ranking the whole MEDLINE database according to a large training set using text indexing.
   BMC Bioinformatics. 2005 Mar 24;6:75. doi: 10.1186/1471-2105-6-75.
10. Extracting drug-drug interaction articles from MEDLINE to improve the content of drug databases.
   AMIA Annu Symp Proc. 2005;2005:216-20.

Cited by

1. Enhancing Automatic PT Tagging for MEDLINE Citations Using Transformer-Based Models.
   ArXiv. 2025 Jun 3:arXiv:2506.03321v1.
2. Publication Type Tagging using Transformer Models and Multi-Label Classification.
   AMIA Annu Symp Proc. 2025 May 22;2024:818-827. eCollection 2024.
3. Enhancing automated indexing of publication types and study designs in biomedical literature using full-text features.
   medRxiv. 2025 Apr 28:2025.04.23.25326300. doi: 10.1101/2025.04.23.25326300.
4. Issues regarding the Indexing of Adaptive Clinical Trial Articles.
   medRxiv. 2025 Mar 11:2025.03.10.25323694. doi: 10.1101/2025.03.10.25323694.
5. Publication Type Tagging using Transformer Models and Multi-Label Classification.
   medRxiv. 2025 Mar 7:2025.03.06.25323516. doi: 10.1101/2025.03.06.25323516.
6. COVID-19-related research data availability and quality according to the FAIR principles: A meta-research study.
   PLoS One. 2024 Nov 18;19(11):e0313991. doi: 10.1371/journal.pone.0313991. eCollection 2024.
7. Automation of systematic reviews of biomedical literature: a scoping review of studies indexed in PubMed.
   Syst Rev. 2024 Jul 8;13(1):174. doi: 10.1186/s13643-024-02592-3.
8. How to optimize the systematic review process using AI tools.
   JCPP Adv. 2024 Apr 23;4(2):e12234. doi: 10.1002/jcv2.12234. eCollection 2024 Jun.
9. Insights into the nutritional prevention of macular degeneration based on a comparative topic modeling approach.
   PeerJ Comput Sci. 2024 Mar 20;10:e1940. doi: 10.7717/peerj-cs.1940. eCollection 2024.
10. Bat4RCT: A suite of benchmark data and baseline methods for text classification of randomized controlled trials.
   PLoS One. 2023 Mar 24;18(3):e0283342. doi: 10.1371/journal.pone.0283342. eCollection 2023.