Corpus-based associations provide additional morphological variants to medical terminologies.

Authors

Zweigenbaum Pierre, Grabar Natalia

Affiliation

Mission de recherche en Sciences et Technologies de l'Information Médicale, STIM/DPA/DSI, Assistance Publique - Hôpitaux de Paris & ERM 202, INSERM, France.

Publication information

AMIA Annu Symp Proc. 2003;2003:768-72.

Abstract

Knowledge of morphologically derived words, as provided for medical English by the UMLS Specialist Lexicon, is useful to detect term variants for automated coding and indexing. For most other languages though, no comparable morphological knowledge base is available. We therefore endeavored to design general methods to help collect such knowledge for a given language. We propose here a method for discovering derived words in text corpora and apply it to a French medical corpus. To evaluate this method, we study its ability to suggest derived adjectives for 2,297 nouns found in the SNOMED nomenclature, which itself specifies adjectival equivalents for some of its terms. 74% of the proposed adjectives are judged correct (precision) and cover 16% of these nouns (recall), a larger amount than what SNOMED already specifies. Furthermore, the corpus suggests additional adjectives which can increase SNOMED's by 76%. We conclude that such a method can help speed up the construction of a morphological knowledge base which can increase the number of term variants in an existing controlled vocabulary.
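The evaluation figures above can be illustrated with a minimal sketch. It assumes that precision is the share of proposed noun-adjective pairs judged correct, and that recall is the share of target nouns that receive at least one correct adjective. This is one reading of the abstract, not the authors' code. The function name `evaluate` and the small French word lists are hypothetical toy data, not taken from the study.

```python
def evaluate(suggestions, judged_correct, target_nouns):
    """Return (precision, recall) for suggested derived adjectives.

    suggestions:    dict mapping a noun to the set of adjectives proposed for it
    judged_correct: set of (noun, adjective) pairs a reviewer accepted
    target_nouns:   set of nouns for which adjectives are sought
    """
    # Only pairs whose noun is in the target set count as proposals.
    proposed = {
        (noun, adj)
        for noun, adjs in suggestions.items()
        if noun in target_nouns
        for adj in adjs
    }
    correct = proposed & judged_correct

    # Precision: fraction of proposed pairs that were judged correct.
    precision = len(correct) / len(proposed) if proposed else 0.0

    # Recall (assumed reading): fraction of target nouns that receive
    # at least one correct adjective.
    covered = {noun for noun, _ in correct}
    recall = len(covered) / len(target_nouns) if target_nouns else 0.0
    return precision, recall


# Toy data, invented purely for illustration.
target_nouns = {"aorte", "artère", "os", "foie"}
suggestions = {
    "aorte": {"aortique"},
    "artère": {"artériel", "artiste"},  # "artiste" is a deliberate false hit
    "os": {"osseux"},
}
judged_correct = {
    ("aorte", "aortique"),
    ("artère", "artériel"),
    ("os", "osseux"),
}

precision, recall = evaluate(suggestions, judged_correct, target_nouns)
```

With this toy input, three of the four proposed pairs are correct and three of the four nouns are covered, so both values are 0.75.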



