Evaluating Biomedical Word Embeddings for Vocabulary Alignment at Scale in the UMLS Metathesaurus Using Siamese Networks.

Authors

Bajaj Goonmeet, Nguyen Vinh, Wijesiriwardene Thilini, Yip Hong Yung, Javangula Vishesh, Parthasarathy Srinivasan, Sheth Amit, Bodenreider Olivier

Affiliations

The Ohio State University.

National Library of Medicine.

Publication information

Proc Conf Assoc Comput Linguist Meet. 2022 May;2022:82-87. doi: 10.18653/v1/2022.insights-1.11.

Abstract

Recent work uses a Siamese Network, initialized with BioWordVec embeddings (distributed word embeddings), to predict synonymy among biomedical terms and thereby automate part of the UMLS (Unified Medical Language System) Metathesaurus construction process. We evaluate contextualized word embeddings extracted from nine different biomedical BERT-based models for synonymy prediction in the UMLS, replacing the BioWordVec embeddings with embeddings extracted from each biomedical BERT model using different feature extraction methods. Surprisingly, we find that Siamese Networks initialized with BioWordVec embeddings still outperform Siamese Networks initialized with embeddings extracted from the biomedical BERT models.
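
To make the setup in the abstract concrete, the following is a minimal sketch of the Siamese idea only: one shared encoder is applied to both terms, and the two encodings are compared to score synonymy. It is not the authors' model. The embedding table `TOY_EMBEDDINGS`, the mean-pooling encoder, the cosine comparison, and the `threshold` value are all assumptions made for illustration; in the evaluated systems the table would hold BioWordVec vectors or vectors extracted from a biomedical BERT model, and the encoder would be trained.

```python
import math

# Hypothetical toy embedding table (made-up values). In a real experiment this
# would be replaced by BioWordVec vectors or BERT-extracted vectors.
TOY_EMBEDDINGS = {
    "heart": [0.9, 0.1, 0.0],
    "attack": [0.1, 0.9, 0.0],
    "myocardial": [0.85, 0.15, 0.05],
    "infarction": [0.2, 0.8, 0.1],
    "skin": [0.0, 0.1, 0.9],
    "rash": [0.1, 0.0, 0.95],
}


def encode(term, embeddings):
    """Shared encoder (assumed): average the vectors of the known tokens."""
    vectors = [embeddings[tok] for tok in term.lower().split() if tok in embeddings]
    if not vectors:
        raise ValueError(f"no known tokens in {term!r}")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def predict_synonymy(term_a, term_b, embeddings, threshold=0.95):
    """Encode both terms with the SAME encoder and compare the results.

    Returns (similarity score, predicted-synonym flag). The threshold is an
    arbitrary illustrative value, not one taken from the paper.
    """
    score = cosine(encode(term_a, embeddings), encode(term_b, embeddings))
    return score, score >= threshold


if __name__ == "__main__":
    for pair in [("heart attack", "myocardial infarction"),
                 ("heart attack", "skin rash")]:
        score, is_syn = predict_synonymy(pair[0], pair[1], TOY_EMBEDDINGS)
        print(pair, round(score, 3), is_syn)
```

Because the embedding table is passed in as an argument, comparing embedding sources in this sketch amounts to calling `predict_synonymy` with a different table while the rest of the pipeline stays fixed, which mirrors the replacement experiment the abstract describes.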



