Disambiguation of ambiguous biomedical terms using examples generated from the UMLS Metathesaurus.

Affiliation

Department of Computer Science, University of Sheffield, Sheffield S1 4DP, UK.

Publication information

J Biomed Inform. 2010 Oct;43(5):762-73. doi: 10.1016/j.jbi.2010.06.001. Epub 2010 Jun 10.

DOI:10.1016/j.jbi.2010.06.001

PMID:20541624

Abstract

Researchers have access to a vast amount of information stored in textual documents, and there is a pressing need for automated methods to enable and improve access to this resource. Lexical ambiguity, the phenomenon in which a word or phrase has more than one possible meaning, presents a significant obstacle to automated text processing. Word Sense Disambiguation (WSD) is a technology that resolves these ambiguities automatically and is an important stage in text understanding. The most accurate approaches to WSD rely on manually labeled examples, but these are usually not available and are prohibitively expensive to create. This paper offers a solution to that problem by using information in the UMLS Metathesaurus to automatically generate labeled examples. Two approaches are presented. The first is an extension of existing work (Liu et al., 2002 [1]) and the second is a novel approach that exploits information in the UMLS that has not previously been used for this purpose. The automatically generated examples are evaluated by comparing them against the manually labeled ones in the NLM-WSD data set and are found to outperform the baseline. The examples generated using the novel approach produce an improvement in WSD performance when combined with manually labeled examples.
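The abstract does not spell out how the labeled examples are generated. As a rough illustration of the general idea only, the Python sketch below assumes that each sense of an ambiguous term has related terms that are themselves unambiguous: sentences containing such a term are collected, the term is replaced by the ambiguous one, and the sentence is labeled with the corresponding sense. The sense labels, related terms, corpus sentences, and the function name generate_examples are all hypothetical stand-ins, not data from the UMLS and not the authors' implementation.

```python
import re

# Hypothetical toy data: each sense label maps to related terms that are
# assumed to be unambiguous. A real system would draw these from a thesaurus.
RELATIVES = {
    "cold/common_cold": ["common cold", "acute nasopharyngitis"],
    "cold/low_temperature": ["cold temperature", "low temperature"],
}

# Hypothetical stand-in for a large collection of unlabeled text.
CORPUS = [
    "Patients with acute nasopharyngitis recovered within a week.",
    "Samples were stored at low temperature before analysis.",
    "The common cold is caused by several viruses.",
    "No relevant term appears in this sentence.",
]


def generate_examples(ambiguous_term, relatives, corpus):
    """Return (sentence, sense) pairs built from unambiguous related terms.

    Every sentence containing a related term is kept, the related term is
    replaced by the ambiguous term, and the sentence is tagged with the
    sense that the related term belongs to.
    """
    examples = []
    for sense, terms in relatives.items():
        for term in terms:
            # Match the related term as a whole phrase, ignoring case.
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            for sentence in corpus:
                if pattern.search(sentence):
                    rewritten = pattern.sub(ambiguous_term, sentence)
                    examples.append((rewritten, sense))
    return examples


if __name__ == "__main__":
    for text, sense in generate_examples("cold", RELATIVES, CORPUS):
        print(sense, "|", text)
```

Pairs produced this way could then be fed to any supervised classifier in place of, or alongside, manually labeled examples. The substitution step can yield slightly unnatural sentences, which is one reason such examples are usually treated as noisy training data.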

Similar articles

Graph-based word sense disambiguation of biomedical documents.

Bioinformatics. 2010 Nov 15;26(22):2889-96. doi: 10.1093/bioinformatics/btq555. Epub 2010 Oct 7.

Disambiguation in the biomedical domain: the role of ambiguity type.

J Biomed Inform. 2010 Dec;43(6):972-81. doi: 10.1016/j.jbi.2010.08.009. Epub 2010 Sep 9.

Collocation analysis for UMLS knowledge-based word sense disambiguation.

BMC Bioinformatics. 2011 Jun 9;12 Suppl 3(Suppl 3):S4. doi: 10.1186/1471-2105-12-S3-S4.

Determining the difficulty of Word Sense Disambiguation.

J Biomed Inform. 2014 Feb;47:83-90. doi: 10.1016/j.jbi.2013.09.009. Epub 2013 Sep 26.

A tool for sharing annotated research data: the "Category 0" UMLS (Unified Medical Language System) vocabularies.

BMC Med Inform Decis Mak. 2003 Jun 16;3:6. doi: 10.1186/1472-6947-3-6.

Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation.

BMC Bioinformatics. 2011 Jun 2;12:223. doi: 10.1186/1471-2105-12-223.

Exploiting domain information for Word Sense Disambiguation of medical documents.

J Am Med Inform Assoc. 2012 Mar-Apr;19(2):235-40. doi: 10.1136/amiajnl-2011-000415. Epub 2011 Sep 7.

Developing a test collection for biomedical word sense disambiguation.

Proc AMIA Symp. 2001:746-50.

Disambiguation of biomedical text using diverse sources of information.

BMC Bioinformatics. 2008 Nov 19;9 Suppl 11(Suppl 11):S7. doi: 10.1186/1471-2105-9-S11-S7.

Cited by

Identifying Medical Concepts and Semantic Types in Lay Vocabularies of Health Consumers Who are Concerned with Diabetes on Social Media Using the UMLS and NLP.

Proc COMPSAC. 2024 Jul;2024:862-869. doi: 10.1109/compsac61105.2024.00119. Epub 2024 Aug 26.

Evaluating the Effectiveness of NoteAid in a Community Hospital Setting: Randomized Trial of Electronic Health Record Note Comprehension Interventions With Patients.

J Med Internet Res. 2021 May 13;23(5):e26354. doi: 10.2196/26354.

Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations.

AMIA Annu Symp Proc. 2012;2012:1004-13. Epub 2012 Nov 3.

Exploiting domain information for Word Sense Disambiguation of medical documents.

J Am Med Inform Assoc. 2012 Mar-Apr;19(2):235-40. doi: 10.1136/amiajnl-2011-000415. Epub 2011 Sep 7.
