Defining and relating biomedical terms: towards a cross-language morphosemantics-based system.

Authors

Namer Fiammetta, Baud Robert

Affiliation

UMR 7118 ATILF & Université Nancy2, CLSH, 23 Boulevard Albert 1er, BP3397, 54015 Nancy Cedex, France.

Publication

Int J Med Inform. 2007 Feb-Mar;76(2-3):226-33. doi: 10.1016/j.ijmedinf.2006.05.001. Epub 2006 Jun 30.

Abstract

This paper addresses how semantic information, namely a definition and a set of semantic relations, can be automatically assigned to compound terms. This is particularly crucial when building multilingual databases and developing cross-language information retrieval systems. The paper shows how morphosemantics can contribute to building multilingual lexical networks in biomedical corpora. It presents a system capable of labelling terms with morphologically related words, i.e. providing them with a definition, and grouping them according to synonymy, hyponymy and proximity relations. The approach requires the interaction of three techniques: (1) a language-specific morphosemantic parser, (2) a multilingual table defining basic relations between word roots and (3) a set of language-independent rules to draw up the list of related terms. The approach has been fully implemented for French on a biomedical lexicon of about 29,000 terms, yielding more than 3,000 lexical families. The results are validated against a file manually annotated by domain experts, and the method is then discussed.
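
The interaction of the three techniques named in the abstract can be pictured with a toy sketch. This is not the authors' system: the root table, the concept names, the is-a links, and the functions `parse`, `is_a` and `relate` are all assumptions made for illustration, and the example uses English-like combining forms rather than the French lexicon the paper works with.

```python
from typing import Dict, List, Optional

# (2) Hypothetical multilingual-style table: surface root -> shared concept.
# Several roots may point to the same concept (e.g. Greek and Latin forms).
ROOT_TO_CONCEPT: Dict[str, str] = {
    "gastr": "STOMACH",
    "stomac": "STOMACH",
    "enter": "INTESTINE",
    "hepat": "LIVER",
    "splanchn": "VISCUS",
    "itis": "INFLAMMATION",
    "algia": "PAIN",
    "odynia": "PAIN",
}

# Hypothetical is-a links between concepts, used for hyponymy.
CONCEPT_PARENT: Dict[str, str] = {
    "STOMACH": "VISCUS",
    "INTESTINE": "VISCUS",
    "LIVER": "VISCUS",
}


def parse(term: str) -> Optional[List[str]]:
    """(1) Toy parser: greedy longest-match segmentation into known roots.

    A linking vowel 'o' between roots is skipped. Returns the list of
    concepts, or None if some part of the term cannot be analysed.
    """
    roots = sorted(ROOT_TO_CONCEPT, key=len, reverse=True)
    concepts: List[str] = []
    i = 0
    while i < len(term):
        for root in roots:
            if term.startswith(root, i):
                concepts.append(ROOT_TO_CONCEPT[root])
                i += len(root)
                break
        else:
            if term[i] == "o":  # linking vowel, carries no concept
                i += 1
            else:
                return None
    return concepts


def is_a(child: Optional[str], parent: str) -> bool:
    """True if `child` equals `parent` or reaches it via CONCEPT_PARENT."""
    while child is not None:
        if child == parent:
            return True
        child = CONCEPT_PARENT.get(child)
    return False


def relate(a: str, b: str) -> Optional[str]:
    """(3) Language-independent rules applied to concept sequences only."""
    ca, cb = parse(a), parse(b)
    if not ca or not cb:
        return None
    if ca == cb:
        return "synonym"
    if len(ca) == len(cb) and all(is_a(x, y) for x, y in zip(ca, cb)):
        return "hyponym"  # a is more specific than b
    if set(ca) & set(cb):
        return "related"  # proximity: at least one shared concept
    return None


print(relate("gastralgia", "gastrodynia"))
print(relate("hepatalgia", "splanchnodynia"))
print(relate("gastritis", "gastralgia"))
```

Because the rules in `relate` only look at concept sequences, a second language could in principle be added by supplying another parser and more rows in the root table, which is the kind of separation the abstract describes.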
