
BERT-based Ranking for Biomedical Entity Normalization

Author Information

Ji Zongcheng, Wei Qiang, Xu Hua

Affiliation

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

Publication Information

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:269-277. eCollection 2020.

Abstract

Developing high-performance entity normalization algorithms that can alleviate the term variation problem is of great interest to the biomedical community. Although deep learning-based methods have been successfully applied to biomedical entity normalization, they often depend on traditional context-independent word embeddings. Bidirectional Encoder Representations from Transformers (BERT), BERT for Biomedical Text Mining (BioBERT) and BERT for Clinical Text Mining (ClinicalBERT) were recently introduced to pre-train contextualized word representation models using bidirectional Transformers, advancing the state-of-the-art for many natural language processing tasks. In this study, we proposed an entity normalization architecture by fine-tuning the pre-trained BERT/BioBERT/ClinicalBERT models and conducted extensive experiments to evaluate the effectiveness of the pre-trained models for biomedical entity normalization using three different types of datasets. Our experimental results show that the best fine-tuned models consistently outperformed previous methods and advanced the state-of-the-art for biomedical entity normalization, with up to a 1.17% increase in accuracy.

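As the title indicates, the paper casts normalization as ranking candidate concept names for each mention with a fine-tuned BERT-family encoder. Below is a minimal sketch of that scoring setup, assuming the Hugging Face transformers library and a pairwise (mention, candidate) sequence-classification formulation; the checkpoint name, the rank_candidates helper, and the candidate list are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of BERT-based candidate ranking for entity normalization.
# NOT the authors' implementation; model choice and candidates are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Any BERT-family checkpoint can be plugged in here (e.g. a BioBERT or
# ClinicalBERT identifier), mirroring the comparison in the abstract.
MODEL_NAME = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Two labels: candidate is / is not the correct concept for the mention.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def rank_candidates(mention: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Score each (mention, candidate) pair and sort candidates by the
    probability of the 'match' label (index 1)."""
    inputs = tokenizer(
        [mention] * len(candidates),   # sentence A: the entity mention
        candidates,                    # sentence B: a candidate concept name
        padding=True, truncation=True, return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    scores = torch.softmax(logits, dim=-1)[:, 1].tolist()
    return sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)

# Hypothetical usage: normalize a disorder mention against candidate
# concept names drawn from a terminology.
print(rank_candidates("heart attack",
                      ["myocardial infarction", "cardiac arrest", "angina pectoris"]))
```

Note that the scores are only meaningful after the classification head has been fine-tuned on labeled mention-concept pairs; out of the box, the head is randomly initialized.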

Similar Articles

1. BERT-based Ranking for Biomedical Entity Normalization.
AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:269-277. eCollection 2020.
7. BioBERT and Similar Approaches for Relation Extraction.
Methods Mol Biol. 2022;2496:221-235. doi: 10.1007/978-1-0716-2305-3_12.

Cited By

8. Transformers and large language models in healthcare: A review.
Artif Intell Med. 2024 Aug;154:102900. doi: 10.1016/j.artmed.2024.102900. Epub 2024 Jun 5.
9. Ways to make artificial intelligence work for healthcare professionals: correspondence.
Antimicrob Steward Healthc Epidemiol. 2024 Jun 4;4(1):e95. doi: 10.1017/ash.2024.85. eCollection 2024.
10. NeighBERT: Medical Entity Linking Using Relation-Induced Dense Retrieval.
J Healthc Inform Res. 2024 Jan 18;8(2):353-369. doi: 10.1007/s41666-023-00136-3. eCollection 2024 Jun.

