Poomagal S, Malar B, Ranganayaki E M, Deepika K, Dheepak G
PSG College of Technology, Coimbatore, Tamil Nadu, India.
SN Comput Sci. 2022;3(6):422. doi: 10.1007/s42979-022-01305-8. Epub 2022 Aug 6.
Classifying product reviews is a Natural Language Processing task that identifies a reviewer's sentiment towards a product. This identification helps a business grow by guiding product quality improvements that increase the number of satisfied customers. Bigram models are popular for this classification because they consider pairs of words that occur consecutively in the reviews. Existing work on bigram models does not consider words that are semantically similar to the words in the bigrams. Because reviewers use different words with the same meaning to express their feelings, we propose improved bigram models in which words semantically similar to those in the bigrams are also used to classify the reviews. In the proposed models, a sentiment polarity thesaurus is constructed from sentiment words and their synonyms. Combinations of the constructed thesaurus, Synset and Word2Vec are used to extract synonyms for the words in the reviews. The performance of the proposed models is compared with the traditional bigram model and state-of-the-art methods. The results show that our models achieve better performance than the traditional model and recent methods.
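The idea of folding synonyms into bigram features can be illustrated with a minimal sketch. This is not the authors' implementation: the hand-written `TOY_THESAURUS` dictionary and the helper names `normalize`, `bigrams` and `bigram_features` are hypothetical, and the dictionary stands in for the constructed thesaurus, Synset and Word2Vec lookups described above. It assumes whitespace tokenization and maps each synonym to a single canonical word before counting bigrams.

```python
from collections import Counter

# Hypothetical toy thesaurus (an assumption for illustration only):
# maps each synonym to one canonical sentiment word.
TOY_THESAURUS = {
    "great": "good", "excellent": "good", "nice": "good",
    "awful": "bad", "terrible": "bad", "poor": "bad",
}

def normalize(tokens, thesaurus=TOY_THESAURUS):
    """Replace every token that has a thesaurus entry with its canonical word."""
    return [thesaurus.get(tok, tok) for tok in tokens]

def bigrams(tokens):
    """Return the consecutive word pairs of a token list."""
    return list(zip(tokens, tokens[1:]))

def bigram_features(review, thesaurus=TOY_THESAURUS):
    """Count bigrams after synonym normalization (whitespace tokenization)."""
    tokens = review.lower().split()
    return Counter(bigrams(normalize(tokens, thesaurus)))

# Two reviews that differ only by a synonym yield the same feature counts.
a = bigram_features("great battery life")
b = bigram_features("excellent battery life")
```

With plain bigrams, the two reviews would share only the pair `("battery", "life")`; after normalization both also contain `("good", "battery")`, so a classifier trained on these counts would treat them alike.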