Linking Cancer Clinical Trials to their Result Publications.

Author information

Pan Evan, Roberts Kirk

Affiliations

Department of Computer Science & Engineering, Texas A&M University, College Station, TX, USA.

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Publication information

AMIA Jt Summits Transl Sci Proc. 2024 May 31;2024:642-651. eCollection 2024.

Abstract

The results of clinical trials are a valuable source of evidence for researchers, policy makers, and healthcare professionals. However, online trial registries do not always contain links to the publications that report on their results, instead requiring a time-consuming manual search. Here, we explored the application of pre-trained transformer-based language models to automatically identify result-reporting publications of cancer clinical trials by computing dense vectors and performing semantic search. Models were fine-tuned on text data from trial registry fields and article metadata using a contrastive learning approach. The best performing model was PubMedBERT, which achieved a mean average precision of 0.592 and ranked 70.3% of a trial's publications in the top 5 results when tested on the holdout test trials. Our results suggest that semantic search using embeddings from transformer models may be an effective approach to the task of linking trials to their publications.
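The retrieval and evaluation steps described in the abstract can be illustrated with a minimal sketch. It assumes that each trial and each article has already been encoded as a dense vector (the toy two-dimensional vectors, the article identifiers, and all function names below are hypothetical and are not taken from the paper). Articles are ranked by cosine similarity to the trial vector, and the ranking is scored with average precision and a top-k recall; averaging the per-trial average precision over all test trials would give the mean average precision. The paper's exact metric definitions may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_articles(trial_vec, article_vecs):
    """Return article ids sorted by descending similarity to the trial vector."""
    return sorted(
        article_vecs,
        key=lambda aid: cosine(trial_vec, article_vecs[aid]),
        reverse=True,
    )


def average_precision(ranked_ids, relevant_ids):
    """Average of precision values at the rank of each relevant article."""
    hits, total = 0, 0.0
    for rank, aid in enumerate(ranked_ids, start=1):
        if aid in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the relevant articles that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    return len(set(ranked_ids[:k]) & set(relevant_ids)) / len(relevant_ids)


# Hypothetical toy data: one trial vector and three candidate article vectors.
trial_vec = [1.0, 0.0]
article_vecs = {
    "A": [0.9, 0.1],
    "B": [0.0, 1.0],
    "C": [0.6, 0.8],
}
relevant = {"A", "B"}  # assumed ground-truth result publications for the trial

ranked = rank_articles(trial_vec, article_vecs)
print(ranked)
print(average_precision(ranked, relevant))
print(recall_at_k(ranked, relevant, k=2))
```

In practice the vectors would come from a fine-tuned encoder such as PubMedBERT and the candidate pool would be far larger, so an approximate nearest-neighbor index would likely replace the exhaustive sort used here.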
