Abed Saad Adnan, Tiun Sabrina, Omar Nazlia
Knowledge Technology Research Group (KT), Centre for Artificial Intelligence Technology (CAIT), Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia.
PLoS One. 2015 Sep 30;10(9):e0136614. doi: 10.1371/journal.pone.0136614. eCollection 2015.
Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (a word with multiple meanings) is intended in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains one or more ambiguous words. In practice, a sentence classified as ambiguous usually has multiple interpretations, only one of which is correct. We propose an unsupervised, knowledge-based method for word sense disambiguation that uses the Harmony Search Algorithm (HSA) together with a Stanford dependencies generator (HSDG). The dependency generator parses sentences to obtain their dependency relations, while the HSA maximizes the overall semantic similarity of the set of parsed words. The HSA fitness function combines semantic similarity and relatedness measures, namely Jiang and Conrath (jcn) and an adapted Lesk algorithm. The proposed method was evaluated on benchmark datasets and yielded results comparable to state-of-the-art WSD methods. To evaluate the effectiveness of the dependency generator, we applied the same methodology without the parser, using a window of words instead. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used.
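To make the search idea concrete, the following is a minimal sketch of harmony search applied to sense selection. It is not the authors' implementation. The sense inventory `SENSES`, the word pairs in `PAIRS` (a stand-in for parser output), the stop-word list, and the parameter values `hms`, `hmcr`, `par`, and `iters` are all assumptions chosen for illustration. A simple gloss-overlap count stands in for the combined jcn and adapted Lesk fitness described in the abstract.

```python
import random

# Hypothetical toy sense inventory (illustrative glosses, not WordNet).
SENSES = {
    "bank": ["financial institution that accepts money deposits",
             "sloping land beside a river"],
    "deposit": ["money placed in a financial account",
                "layer of sediment left by a river"],
    "interest": ["money paid on a financial loan or account",
                 "feeling of curiosity about something"],
}

# Hypothetical dependency-linked word pairs (stand-in for parser output).
PAIRS = [("bank", "deposit"), ("bank", "interest"), ("deposit", "interest")]

STOP = {"a", "of", "on", "or", "that", "by", "in", "the"}


def overlap(gloss_a, gloss_b):
    """Lesk-style score: number of shared non-stop words in two glosses."""
    return len((set(gloss_a.split()) - STOP) & (set(gloss_b.split()) - STOP))


def fitness(assign):
    """Sum of gloss overlaps over all linked word pairs for one assignment."""
    return sum(overlap(SENSES[a][assign[a]], SENSES[b][assign[b]])
               for a, b in PAIRS)


def harmony_search(words, hms=5, hmcr=0.9, par=0.3, iters=500, seed=0):
    """Search for the sense assignment with the highest fitness."""
    rng = random.Random(seed)

    def rand_sense(w):
        return rng.randrange(len(SENSES[w]))

    # Harmony memory: hms random sense assignments.
    memory = [{w: rand_sense(w) for w in words} for _ in range(hms)]
    for _ in range(iters):
        new = {}
        for w in words:
            if rng.random() < hmcr:
                # Memory consideration: reuse a sense from a stored harmony.
                new[w] = rng.choice(memory)[w]
                if rng.random() < par:
                    # Pitch adjustment: move to the neighbouring sense index.
                    new[w] = (new[w] + 1) % len(SENSES[w])
            else:
                # Random selection from the full sense range.
                new[w] = rand_sense(w)
        # Replace the worst stored harmony if the new one is strictly better.
        worst = min(memory, key=fitness)
        if fitness(new) > fitness(worst):
            memory[memory.index(worst)] = new
    return max(memory, key=fitness)


best = harmony_search(list(SENSES))
print(best, fitness(best))
```

In this toy setting each harmony is one candidate assignment of a sense index to every word, and the memory keeps the best assignments seen so far. Replacing `overlap` with a real similarity measure and `PAIRS` with actual dependency relations would bring the sketch closer to the approach the abstract describes.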