Hajek Petr, Barushka Aliaksandr, Munk Michal
Science and Research Centre, Faculty of Economics and Administration, University of Pardubice, Studentska 84, 532 10 Pardubice, Czech Republic.
Department of Computer Science, Constantine the Philosopher University in Nitra, 949 74 Nitra, Slovakia.
Int J Neural Syst. 2021 Oct;31(10):2150013. doi: 10.1142/S0129065721500131. Epub 2021 Feb 10.
Automated sentiment analysis is becoming increasingly recognized due to the growing importance of social media and e-commerce platform review websites. Deep neural networks outperform traditional lexicon-based and machine learning methods by effectively exploiting contextual word embeddings to generate dense document representations. However, this representation model is not fully adequate to capture topical semantics and the sentiment polarity of words. To overcome these problems, a novel sentiment analysis model is proposed that utilizes richer document representations of word-emotion associations and topic models, which is the main computational novelty of this study. The sentiment analysis model integrates word embeddings with lexicon-based sentiment and emotion indicators, including negations and emoticons. To further improve its performance, a topic modeling component is utilized together with a bag-of-words model based on a supervised term weighting scheme. The effectiveness of the proposed model is evaluated using large datasets of Amazon product reviews and hotel reviews. Experimental results show that the proposed document representation is valid for the sentiment analysis of product and hotel reviews, irrespective of their class imbalance. The results also show that the proposed model improves on existing machine learning methods.
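The combined document representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy embeddings, the polarity lexicon, the negation rule (a negation word flips the polarity of the next token only), and the log-ratio term weight are all assumptions chosen for illustration, and every function name is hypothetical. The topic-model block is omitted for brevity; it would be one more vector concatenated in the same way.

```python
import math
from collections import Counter

# Hypothetical toy resources; the paper's embeddings and lexicons are not given here.
EMBEDDINGS = {"good": [0.9, 0.1], "bad": [-0.8, 0.2],
              "room": [0.0, 0.7], "service": [0.1, 0.6]}
POLARITY = {"good": 1.0, "bad": -1.0, ":)": 1.0, ":(": -1.0}  # words and emoticons
NEGATIONS = {"not", "no", "never"}

def tokenize(text):
    return text.lower().split()

def mean_embedding(tokens, dim=2):
    """Dense block: average of the embeddings of known tokens."""
    vecs = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def lexicon_features(tokens):
    """Lexicon block: [positive mass, negative mass] with simple negation handling."""
    pos = neg = 0.0
    negate = False
    for t in tokens:
        if t in NEGATIONS:
            negate = True
            continue
        score = POLARITY.get(t, 0.0)
        if negate:  # assumption: a negation flips only the next token
            score, negate = -score, False
        if score > 0:
            pos += score
        elif score < 0:
            neg -= score
    return [pos, neg]

def supervised_weights(docs, labels):
    """Assumed supervised weight: smoothed log2 ratio of class document frequencies."""
    pos_df, neg_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (pos_df if y == 1 else neg_df).update(set(tokens))
    vocab = set(pos_df) | set(neg_df)
    return {t: math.log2((1 + pos_df[t]) / (1 + neg_df[t])) for t in vocab}

def weighted_bow(tokens, weights, vocab):
    """Sparse block: term frequency times the supervised weight."""
    tf = Counter(tokens)
    return [tf[t] * weights[t] for t in vocab]

def represent(text, weights, vocab):
    """Concatenate the dense, lexicon, and weighted bag-of-words blocks."""
    tokens = tokenize(text)
    return (mean_embedding(tokens) + lexicon_features(tokens)
            + weighted_bow(tokens, weights, vocab))

# Tiny labelled corpus (1 = positive review, 0 = negative review).
train = ["good room good service", "bad service", "not good room :("]
labels = [1, 0, 0]
docs = [tokenize(t) for t in train]
weights = supervised_weights(docs, labels)
vocab = sorted(weights)
vec = represent("not bad service :)", weights, vocab)
```

The resulting vector `vec` would then be passed to a classifier such as a deep neural network. Keeping each feature block in its own function makes it straightforward to ablate one block at a time and check its contribution.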