Alsubari Saleh Nagi, Deshmukh Sachin N, Al-Adhaileh Mosleh Hmoud, Alsaade Fawaz Waselalla, Aldhyani Theyazn H H
Department of Computer Science & Information Technology, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad, India.
Deanship of E-Learning and Distance Education, King Faisal University, Al-Ahsa, Saudi Arabia.
Appl Bionics Biomech. 2021 Apr 14;2021:5522574. doi: 10.1155/2021/5522574. eCollection 2021.
Online product reviews play a major role in the success or failure of an e-commerce business. Before purchasing products or services, shoppers usually read the online reviews posted by previous customers to learn about the products and make purchasing decisions. However, fraudsters can promote or damage specific products by posting fake reviews. Such reviews can cause financial loss to e-commerce businesses and mislead consumers into making the wrong decision or searching for alternative products. A fake review detection system is therefore needed for e-commerce businesses. The proposed methodology uses four standard fake review datasets from multiple domains: hotels, restaurants, Yelp, and Amazon. Preprocessing methods such as stopword removal, punctuation removal, and tokenization are applied, along with sequence padding so that the input sequences have a fixed length during training, validation, and testing of the model. Because the datasets differ in size, different input word-embedding matrices of n-gram features of the review text are created by the word-embedding layer, which is one component of the proposed model. Convolutional and max-pooling layers of the CNN are used for feature extraction and dimensionality reduction, respectively. The LSTM layer, which is based on gate mechanisms, is combined with the CNN to learn and handle the contextual information of the n-gram features of the review text. Finally, a sigmoid activation function as the last layer of the proposed model receives the output of the previous layer and performs binary classification of the review text as fake or truthful. In this paper, the proposed CNN-LSTM model was evaluated in two types of experiments: in-domain and cross-domain.
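The preprocessing steps named above (stopword removal, punctuation removal, tokenization, and padding to a fixed length) can be illustrated with a minimal Python sketch. The stopword list, the reserved padding and out-of-vocabulary ids, the right-padding choice, and the helper names `preprocess`, `build_vocab`, and `encode_and_pad` are all assumptions made for illustration; the abstract does not specify them.

```python
import string

# Illustrative stopword list (an assumption; the abstract does not name the list used).
STOPWORDS = {"a", "an", "the", "is", "was", "and", "of", "to", "in", "it"}
PAD_ID = 0  # id reserved for padding positions (assumed convention)
OOV_ID = 1  # id for words missing from the vocabulary (assumed convention)


def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, drop stopwords."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]


def build_vocab(token_lists):
    """Map each distinct token to an integer id, starting after the reserved ids."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 2)
    return vocab


def encode_and_pad(tokens, vocab, max_len):
    """Convert tokens to ids, then truncate or right-pad to exactly max_len."""
    ids = [vocab.get(tok, OOV_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Toy reviews, invented for this example only.
reviews = ["The room was clean, and the staff was friendly!", "Terrible food."]
tokenized = [preprocess(r) for r in reviews]
vocab = build_vocab(tokenized)
padded = [encode_and_pad(t, vocab, max_len=6) for t in tokenized]
```

Every row of `padded` has the same length, which is what allows reviews of different lengths to be batched and passed to an embedding layer.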
In the in-domain experiments, the model is applied to each dataset individually, whereas in the cross-domain experiment, all datasets are combined into a single data frame and evaluated together. In the in-domain experiments, the model achieved test accuracies of 77%, 85%, 86%, and 87% on the restaurant, hotel, Yelp, and Amazon datasets, respectively. In the cross-domain experiment, the proposed model attained 89% accuracy. Furthermore, a comparative analysis of the in-domain results against existing approaches, based on the accuracy metric, showed that the proposed model outperformed the compared methods.
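The difference between the two evaluation settings can be sketched in Python as follows. This is only an illustration of the data handling: the toy examples, the 20% test fraction, the fixed shuffle seed, and the helper names are assumptions, and the split ratios actually used in the study are not given in the abstract.

```python
import random


def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and return (train, test)."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def in_domain_splits(datasets, **kwargs):
    """In-domain: each dataset is split and evaluated on its own."""
    return {name: train_test_split(ex, **kwargs) for name, ex in datasets.items()}


def cross_domain_split(datasets, **kwargs):
    """Cross-domain: all datasets are pooled first, then split once."""
    pooled = [ex for examples in datasets.values() for ex in examples]
    return train_test_split(pooled, **kwargs)


# Placeholder (text, label) pairs, invented for illustration; 1 = fake, 0 = truthful.
datasets = {
    "hotel": [(f"hotel review {i}", i % 2) for i in range(10)],
    "restaurant": [(f"restaurant review {i}", i % 2) for i in range(5)],
}
per_domain = in_domain_splits(datasets, test_fraction=0.2)
train_all, test_all = cross_domain_split(datasets, test_fraction=0.2)
```

In the in-domain setting a separate model would be trained and scored with `accuracy` on each entry of `per_domain`; in the cross-domain setting a single model would be trained on `train_all` and scored on `test_all`.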