Noferesti Samira, Shamsfard Mehrnoush
Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran.
PLoS One. 2015 May 11;10(5):e0124993. doi: 10.1371/journal.pone.0124993. eCollection 2015.
Opinion mining is a well-known problem in natural language processing that has attracted increasing attention in recent years. Existing approaches are mainly limited to identifying direct opinions and mostly address explicit ones. However, in some domains, such as medicine, opinions about an entity are usually not expressed directly through opinion words; instead, they are expressed indirectly by describing the effect of that entity on other entities. Ignoring indirect opinions can therefore lead to the loss of valuable information and a noticeable decline in the overall accuracy of opinion mining systems. In this paper, we first introduce the task of indirect opinion mining. We then present a novel approach to constructing a knowledge base of indirect opinions, called OpinionKB, which aims to be a resource for automatically classifying people's opinions about drugs. Using our approach, we extracted 896 quadruples of indirect opinions at a precision of 88.08 percent. Furthermore, experiments on drug reviews demonstrate that our approach achieves 85.25 percent precision in the polarity detection task and outperforms state-of-the-art opinion mining methods. We also build a corpus of indirect opinions about drugs, which can serve as a basis for supervised indirect opinion mining. The proposed approach to corpus construction achieves a precision of 88.42 percent.