Grabar Natalia, Hamon Thierry
Université Pierre et Marie Curie, Paris, France.
Stud Health Technol Inform. 2010;160(Pt 2):1015-9.
Acquiring and enriching lexical resources is an important research area in computational linguistics. We propose a method for inducing a lexicon of synonyms and for weighting it in order to establish its reliability. The method is based on the analysis of the syntactic structure of complex terms. We apply and evaluate the approach on three biomedical terminologies (MeSH, Snomed Int, Snomed CT). Between 7.7% and 33.6% of the induced synonyms are ambiguous and co-occur with other semantic relations. A virtual reference makes it possible to validate 9% to 14% of the induced synonyms.
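As a rough illustration of how synonyms might be induced from the structure of complex terms and then weighted, the following minimal sketch may help. It is not the authors' implementation. It assumes that each complex term can be split into a head and an expansion, that two terms already known to be synonymous and differing in only one of these components suggest a synonymy between the differing components, and that the weight of an induced pair is simply the number of term pairs supporting it. The function names, the naive "last word is the head" split, and the toy input pairs are all hypothetical.

```python
from collections import Counter


def split_term(term):
    """Naive stand-in for syntactic analysis (assumption): the last word
    is taken as the head, the preceding words as the expansion."""
    words = term.lower().split()
    return words[-1], " ".join(words[:-1])


def induce_synonyms(synonymous_terms):
    """Return a Counter mapping each induced synonym pair to the number
    of complex-term pairs that support it (a simple, assumed weight)."""
    support = Counter()
    for term_a, term_b in synonymous_terms:
        head_a, exp_a = split_term(term_a)
        head_b, exp_b = split_term(term_b)
        if not exp_a or not exp_b:
            continue  # skip single-word terms: nothing to decompose
        if head_a == head_b and exp_a != exp_b:
            # Same head, different expansions: the expansions are candidates.
            support[tuple(sorted((exp_a, exp_b)))] += 1
        elif exp_a == exp_b and head_a != head_b:
            # Same expansion, different heads: the heads are candidates.
            support[tuple(sorted((head_a, head_b)))] += 1
    return support


# Toy input (hypothetical, not taken from MeSH or Snomed).
pairs = [
    ("renal failure", "kidney failure"),
    ("renal disease", "kidney disease"),
    ("liver tumor", "liver neoplasm"),
]
weights = induce_synonyms(pairs)
for pair, weight in weights.most_common():
    print(pair, weight)
```

In this sketch, a pair supported by several complex terms receives a higher weight than a pair seen once, which is one simple way to rank induced synonyms by reliability.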