Discovering missed synonymy in a large concept-oriented Metathesaurus.

Authors

Hole W T, Srinivasan S

Affiliation

National Library of Medicine, Bethesda, MD, USA.

Publication

Proc AMIA Symp. 2000:354-8.

Abstract

The Unified Medical Language System (UMLS) [1, 2] Metathesaurus is concept-oriented; its goal is to unite all names with identical meaning in a single Concept. The names come from its constituent vocabularies or "sources": a wide variety of biomedical terminologies, including many controlled vocabularies and classifications used in patient records, administrative health data, bibliographic, research, full-text, and expert systems. Many offer little definitional information, and many are not themselves concept-oriented, so identifying synonymy is a challenging semantic task [3]. The rapidly increasing size of the Metathesaurus makes the task daunting and demands effective computational support; there are more than 1.5 million names for 730,000 concepts in the January 2000 release. Vocabularies are added and updated using sophisticated lexical matching, selective algorithms, and expert review [4, 5, 6]. Yet the result is imperfect; we have discovered and corrected missed synonymy in approximately 1% of previously released concepts each year. This paper reviews general methods for finding missed synonymy and describes several specific novel approaches which we have found effective.
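To make the idea of lexical matching concrete, the following is a minimal sketch of one generic way to surface candidate missed synonymy: names are reduced to a normalized key, and any key that appears under more than one concept is flagged for review. This is an illustration under stated assumptions, not the method described in the paper. The functions `normalize` and `candidate_missed_synonyms`, the concept identifiers, and the sample names are all hypothetical.

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and sort words so word order is ignored.

    This is a deliberately crude, assumed normalization; real lexical
    tools also handle inflection, stop words, and spelling variants.
    """
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(words))


def candidate_missed_synonyms(names):
    """Group (concept_id, name) pairs by normalized key.

    Returns a dict mapping each normalized key to the sorted list of
    concept ids that share it, keeping only keys found in more than one
    concept. Each such key is a candidate, not a confirmed synonym.
    """
    by_key = defaultdict(set)
    for concept_id, name in names:
        by_key[normalize(name)].add(concept_id)
    return {key: sorted(ids) for key, ids in by_key.items() if len(ids) > 1}


# Hypothetical sample data; the identifiers are invented for illustration.
sample = [
    ("C001", "Fracture of femur"),
    ("C002", "Femur, fracture of"),
    ("C003", "Myocardial infarction"),
    ("C001", "Femur fracture"),
]

if __name__ == "__main__":
    for key, concept_ids in candidate_missed_synonyms(sample).items():
        print(key, concept_ids)
```

A match on the normalized key only suggests that two concepts may share a meaning. In keeping with the abstract's mention of expert review, such candidates would still need human confirmation before any concepts are merged.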

Cited by

1
A GCN-based approach to uncover misaligned synonymous terms in the UMLS Metathesaurus.
AMIA Annu Symp Proc. 2024 Jan 11;2023:977-986. eCollection 2023.
2
A review of auditing techniques for the Unified Medical Language System.
J Am Med Inform Assoc. 2020 Oct 1;27(10):1625-1638. doi: 10.1093/jamia/ocaa108.
3
A new synonym-substitution method to enrich the human phenotype ontology.
BMC Bioinformatics. 2017 Oct 10;18(1):446. doi: 10.1186/s12859-017-1858-7.
4
Automated mapping of clinical terms into SNOMED-CT. An application to codify procedures in pathology.
J Med Syst. 2014 Oct;38(10):134. doi: 10.1007/s10916-014-0134-x. Epub 2014 Sep 2.
5
A review of medication reconciliation issues and experiences with clinical staff and information systems.
Appl Clin Inform. 2010 Dec 1;1(4):442-61. doi: 10.4338/ACI-2010-02-R-0010. Print 2010.
6
Auditing SNOMED Integration into the UMLS for Duplicate Concepts.
AMIA Annu Symp Proc. 2010 Nov 13;2010:321-5.
7
Determining correspondences between high-frequency MedDRA concepts and SNOMED: a case study.
BMC Med Inform Decis Mak. 2010 Oct 28;10:66. doi: 10.1186/1472-6947-10-66.
8
The UMLS-CORE project: a study of the problem list terminologies used in large healthcare institutions.
J Am Med Inform Assoc. 2010 Nov-Dec;17(6):675-80. doi: 10.1136/jamia.2010.007047.
9
Expanding the extent of a UMLS semantic type via group neighborhood auditing.
J Am Med Inform Assoc. 2009 Sep-Oct;16(5):746-57. doi: 10.1197/jamia.M2951. Epub 2009 Jun 30.
10
The Neighborhood Auditing Tool: a hybrid interface for auditing the UMLS.
J Biomed Inform. 2009 Jun;42(3):468-89. doi: 10.1016/j.jbi.2009.01.006.

References

1
Merging terminologies.
Medinfo. 1995;8 Pt 1:162-6.
2
The Unified Medical Language System.
Methods Inf Med. 1993 Aug;32(4):281-91. doi: 10.1055/s-0038-1634945.
3
Lexical methods for managing variation in biomedical terminologies.
Proc Annu Symp Comput Appl Med Care. 1994:235-9.
