Deléger Louise, Namer Fiammetta, Zweigenbaum Pierre
INSERM, UMR_S 872, Eq. 20, Les Cordeliers, Paris, F-75006 France.
Stud Health Technol Inform. 2007;129(Pt 1):535-9.
Medical language, like many technical languages, is rich in morphologically complex words, many of which take their roots from Greek and Latin, in which case they are called neoclassical compounds. Morphosemantic analysis can help generate definitions of such words. This paper reports work on adapting a morphosemantic analyzer designed for French (DériF) to analyze English medical neoclassical compounds. It presents the principles of this transposition and its current performance. The analyzer was tested on a set of 1,299 compounds extracted from the WHO-ART terminology. Of these, 859 could be decomposed and defined, 675 of them correctly. An advantage of this process is that complex linguistic analyses designed for French could be successfully transferred to the analysis of English medical neoclassical compounds. Moreover, the resulting system can produce more complete analyses of English medical compounds than existing systems, including a hierarchical decomposition and a semantic gloss of each word.
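To make the idea of decomposing and glossing a neoclassical compound concrete, here is a minimal Python sketch. It is not DériF and does not reproduce its rules or resources: the tiny lexicon, the linking-vowel handling, the function names `decompose` and `gloss`, and the gloss wording are all assumptions made for illustration. It produces a flat segmentation rather than the hierarchical decomposition the abstract describes. The last lines simply restate the counts reported in the abstract as rates.

```python
from typing import Optional

# Hypothetical mini-lexicon of combining forms (NOT DériF's actual resources).
COMBINING_FORMS = {
    "gastr": "stomach",
    "enter": "intestine",
    "hepat": "liver",
    "cardi": "heart",
    "my": "muscle",
}

# Hypothetical final elements that carry the head meaning of the compound.
SUFFIXES = {
    "itis": "inflammation",
    "algia": "pain",
    "pathy": "disease",
    "megaly": "enlargement",
}

# Assumed linking vowels that may join two components (e.g. gastr-o-enter).
LINKING_VOWELS = ("o", "i")


def decompose(word: str) -> Optional[list[str]]:
    """Split a word into combining forms followed by one final element.

    Returns None when the toy lexicon cannot cover the whole word.
    """
    word = word.lower()
    if word in SUFFIXES:
        return [word]
    # Try longer combining forms first so short ones do not shadow them.
    for form in sorted(COMBINING_FORMS, key=len, reverse=True):
        if not word.startswith(form):
            continue
        rest = word[len(form):]
        candidates = [rest]
        if rest[:1] in LINKING_VOWELS:
            candidates.append(rest[1:])  # also try without the linking vowel
        for candidate in candidates:
            tail = decompose(candidate)
            if tail is not None:
                return [form] + tail
    return None


def gloss(parts: list[str]) -> str:
    """Build a naive gloss: head meaning, then the modifying components."""
    head = SUFFIXES[parts[-1]]
    modifiers = [COMBINING_FORMS[p] for p in parts[:-1]]
    if not modifiers:
        return head
    return f"{head} involving {', '.join(modifiers)}"


for term in ["gastroenteritis", "cardiomyopathy", "hepatomegaly", "xyzitis"]:
    parts = decompose(term)
    print(term, "->", parts, "|", gloss(parts) if parts else "no analysis")

# Counts taken from the abstract, expressed as rates.
total, analyzed, correct = 1299, 859, 675
print(f"analyzed: {analyzed / total:.1%}")           # about 66.1%
print(f"correct among analyzed: {correct / analyzed:.1%}")  # about 78.6%
print(f"correct overall: {correct / total:.1%}")     # about 52.0%
```

A real analyzer needs far richer lexical resources and rules than this lookup table. The sketch only shows why segmentation must cover the whole word before a gloss can be produced, which is one way a compound can end up without an analysis.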