A GCN-based approach to uncover misaligned synonymous terms in the UMLS Metathesaurus.

Affiliations

McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX.

Department of Neurology, The University of Texas Health Science Center at Houston, Houston, TX.

Publication information

AMIA Annu Symp Proc. 2024 Jan 11;2023:977-986. eCollection 2023.

PMID: 38222357

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10785861/

Abstract

The Unified Medical Language System (UMLS), a large repository of biomedical vocabularies, has been used for supporting various biomedical applications. Ensuring the quality of the UMLS is critical to maintain both the accuracy of its content and the reliability of downstream applications. In this work, we present a Graph Convolutional Network (GCN)-based approach to identify misaligned synonymous terms organized under different UMLS concepts. We used synonymous terms grouped under the same concept as positive samples and top lexically similar terms as negative samples to train the GCN model. We applied the model to a test set and suggested those negative samples predicted to be synonymous as potentially misaligned synonymous terms. A total of 147,625 suggestions were made. A human expert evaluated 100 randomly selected suggestions and agreed with 60 of them. The results indicate that our GCN-based approach shows promise to help improve the synonymy grouping in the UMLS.
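The sampling and suggestion steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy terms and concept identifiers are made up, token-level Jaccard similarity is an assumed stand-in for "lexically similar" (the abstract does not name the measure), and `stub_model` is a hypothetical placeholder for the trained GCN classifier.

```python
from itertools import combinations

# Toy term-to-concept mapping; terms and concept IDs are invented for illustration.
TERM_TO_CONCEPT = {
    "heart attack": "C1",
    "myocardial infarction": "C1",
    "infarction of myocardium": "C2",
    "cerebral infarction": "C3",
}

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (assumed lexical similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def positive_pairs(term_to_concept):
    """Positive samples: term pairs grouped under the same concept."""
    return [
        (a, b)
        for a, b in combinations(sorted(term_to_concept), 2)
        if term_to_concept[a] == term_to_concept[b]
    ]

def negative_pairs(term_to_concept, top_k=2):
    """Negative samples: each term paired with its top_k most lexically
    similar terms that sit under a different concept."""
    pairs = set()
    for term, concept in term_to_concept.items():
        others = [
            t for t, c in term_to_concept.items()
            if c != concept and jaccard(term, t) > 0
        ]
        # Sort by descending similarity, then alphabetically to break ties.
        ranked = sorted(others, key=lambda t: (-jaccard(term, t), t))
        for other in ranked[:top_k]:
            pairs.add(tuple(sorted((term, other))))
    return sorted(pairs)

def suggest_misaligned(negatives, predict_synonymous):
    """Negative pairs that the model predicts to be synonymous become
    suggestions of potentially misaligned synonymous terms."""
    return [pair for pair in negatives if predict_synonymous(*pair)]

def stub_model(a: str, b: str) -> bool:
    """Hypothetical placeholder for the trained GCN's prediction."""
    return {a, b} == {"infarction of myocardium", "myocardial infarction"}

positives = positive_pairs(TERM_TO_CONCEPT)
negatives = negative_pairs(TERM_TO_CONCEPT)
suggestions = suggest_misaligned(negatives, stub_model)
```

In this toy setup, a pair only reaches `suggestions` if it was first drawn as a negative sample, which mirrors the abstract's framing: the candidates are terms that the Metathesaurus currently places under different concepts but the model judges to be synonymous.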

