Discovering missed synonymy in a large concept-oriented Metathesaurus.

Author information

Hole W T, Srinivasan S

Affiliation

National Library of Medicine, Bethesda, MD, USA.

Publication information

Proc AMIA Symp. 2000:354-8.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2244099/

Abstract

The Unified Medical Language System (UMLS) [1, 2] Metathesaurus is concept-oriented; its goal is to unite all names with identical meaning in a single Concept. The names come from its constituent vocabularies or "sources"--a wide variety of biomedical terminologies including many controlled vocabularies and classifications used in patient records, administrative health data, bibliographic, research, full-text, and expert systems. Many offer little definitional information, and many are not themselves concept-oriented, so identifying synonymy is a challenging semantic task [3]. The rapidly increasing size of the Metathesaurus makes the task daunting, demanding effective computational support; there are more than 1.5 million names for 730,000 concepts in the January 2000 release. Vocabularies are added and updated using sophisticated lexical matching, selective algorithms, and expert review [4, 5, 6]. Yet the result is imperfect; we have discovered and corrected missed synonymy in approximately 1% of previously released concepts each year. This paper reviews general methods for finding missed synonymy and describes several specific novel approaches which we have found effective.
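To make the idea of lexical matching concrete, the following is a minimal sketch of how names filed under different concepts could be flagged as candidates for missed synonymy. It is a generic illustration, not the authors' algorithm: the normalization rules, the stop-word list, the function names, and the concept identifiers (`C001` and so on) are all assumptions made for this example. Any pair it flags would still need expert review.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; a real system would use a curated one.
STOP_WORDS = {"of", "the", "and", "in"}


def normalize(name: str) -> str:
    """Lowercase, drop punctuation and stop words, and sort the tokens."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    kept = sorted(t for t in tokens if t not in STOP_WORDS)
    return " ".join(kept)


def candidate_missed_synonyms(names):
    """Group (concept_id, name) pairs by normalized name.

    Returns a dict mapping each normalized key to the set of concept
    identifiers that share it, keeping only keys shared by two or more
    distinct concepts. These are candidates for review, not confirmed
    synonyms.
    """
    by_key = defaultdict(set)
    for concept_id, name in names:
        key = normalize(name)
        if key:  # skip names that normalize to nothing
            by_key[key].add(concept_id)
    return {key: ids for key, ids in by_key.items() if len(ids) > 1}


# Toy data with made-up concept identifiers.
toy = [
    ("C001", "Fracture of femur"),
    ("C002", "Femur fracture"),
    ("C003", "Fracture of tibia"),
]
print(candidate_missed_synonyms(toy))
```

On the toy data, the first two names normalize to the same key and are reported together, while the tibia entry is left alone. Sorting tokens trades precision for recall: it catches word-order variants but can also pair names whose meaning depends on word order, which is one reason human review remains part of the process described in the abstract.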

Cited by

1

A GCN-based approach to uncover misaligned synonymous terms in the UMLS Metathesaurus.

AMIA Annu Symp Proc. 2024 Jan 11;2023:977-986. eCollection 2023.

2

A review of auditing techniques for the Unified Medical Language System.

J Am Med Inform Assoc. 2020 Oct 1;27(10):1625-1638. doi: 10.1093/jamia/ocaa108.

3

A new synonym-substitution method to enrich the human phenotype ontology.

BMC Bioinformatics. 2017 Oct 10;18(1):446. doi: 10.1186/s12859-017-1858-7.

4

Automated mapping of clinical terms into SNOMED-CT. An application to codify procedures in pathology.

J Med Syst. 2014 Oct;38(10):134. doi: 10.1007/s10916-014-0134-x. Epub 2014 Sep 2.

5

A review of medication reconciliation issues and experiences with clinical staff and information systems.

Appl Clin Inform. 2010 Dec 1;1(4):442-61. doi: 10.4338/ACI-2010-02-R-0010. Print 2010.

6

Auditing SNOMED Integration into the UMLS for Duplicate Concepts.

AMIA Annu Symp Proc. 2010 Nov 13;2010:321-5.

7

Determining correspondences between high-frequency MedDRA concepts and SNOMED: a case study.

BMC Med Inform Decis Mak. 2010 Oct 28;10:66. doi: 10.1186/1472-6947-10-66.

8

The UMLS-CORE project: a study of the problem list terminologies used in large healthcare institutions.

J Am Med Inform Assoc. 2010 Nov-Dec;17(6):675-80. doi: 10.1136/jamia.2010.007047.

9

Expanding the extent of a UMLS semantic type via group neighborhood auditing.

J Am Med Inform Assoc. 2009 Sep-Oct;16(5):746-57. doi: 10.1197/jamia.M2951. Epub 2009 Jun 30.

10

The Neighborhood Auditing Tool: a hybrid interface for auditing the UMLS.

J Biomed Inform. 2009 Jun;42(3):468-89. doi: 10.1016/j.jbi.2009.01.006.

References

1

Merging terminologies.

Medinfo. 1995;8 Pt 1:162-6.

2

The Unified Medical Language System.

Methods Inf Med. 1993 Aug;32(4):281-91. doi: 10.1055/s-0038-1634945.

3

Lexical methods for managing variation in biomedical terminologies.

Proc Annu Symp Comput Appl Med Care. 1994:235-9.
