College of Information Engineering, Sichuan Agricultural University, Ya'an 625000, China.
The Lab of Agricultural Information Engineering, Sichuan Key Laboratory, Ya'an 625000, China.
Int J Environ Res Public Health. 2022 Oct 19;19(20):13520. doi: 10.3390/ijerph192013520.
Tourists increasingly check reviews of attractions before traveling to decide whether to visit them. To respond to this change in how tourists choose attractions, it is important to classify attraction reviews with high precision. In addition, more and more tourists use emojis to express their satisfaction or dissatisfaction with attractions. In this paper, we built a dataset for Chinese attraction evaluation incorporating emojis (CAEIE) and proposed E2G, a model that combines ERNIE-Gram (a pre-trained model that uses explicitly n-gram masking to enhance the integration of coarse-grained information into pre-training) with a Text Graph Convolutional Network (TextGCN), to classify the dataset with high accuracy. E2G preprocesses the text and feeds it to both ERNIE-Gram and TextGCN. ERNIE-Gram is trained with its n-gram masking mechanism and outputs category probabilities. TextGCN builds a heterogeneous graph from the dataset, with review texts and words as nodes, and is trained to obtain document representations from which it outputs category probabilities. The two sets of probabilities are then combined to obtain the final result. To demonstrate the validity of E2G, we compared it with advanced models. The experiments showed that E2G classified the CAEIE dataset well, with a classification accuracy of up to 97.37%, which is 1.37% and 1.35% higher than ERNIE-Gram and TextGCN, respectively. In addition, two sets of comparison experiments were conducted to compare TextGCN and TextGAT on the CAEIE dataset: ERNIE and ERNIE-Gram were each combined with TextGCN and with TextGAT, and the TextGCN combinations were 1.6% and 2.15% ahead, respectively. Finally, we compared the effects of eight activation functions on the second layer of TextGCN; in our experiments, rectified linear unit 6 (ReLU6) gave the best results.
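The abstract says that the probabilities from ERNIE-Gram and TextGCN are used together to produce the final result, but it does not give the formula. The following is a minimal sketch of one common choice, a weighted average of the two probability vectors. The fusion rule itself, the function names `fuse_probabilities` and `predict`, and the weight `alpha` are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def fuse_probabilities(p_ernie: Sequence[float],
                       p_gcn: Sequence[float],
                       alpha: float = 0.5) -> List[float]:
    """Weighted average of two class-probability vectors.

    `alpha` is a hypothetical mixing weight: 1.0 trusts only the
    ERNIE-Gram branch, 0.0 trusts only the TextGCN branch.
    """
    if len(p_ernie) != len(p_gcn):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_ernie, p_gcn)]


def predict(p_ernie: Sequence[float],
            p_gcn: Sequence[float],
            alpha: float = 0.5) -> int:
    """Return the index of the class with the highest fused probability."""
    fused = fuse_probabilities(p_ernie, p_gcn, alpha)
    return max(range(len(fused)), key=fused.__getitem__)
```

Because a weighted average of two probability vectors is itself a probability vector, the fused scores can be read directly as class probabilities; in practice `alpha` would be tuned on a validation split.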
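For readers unfamiliar with the activation function named in the abstract, ReLU6 clamps its input to the range [0, 6]. The sketch below shows it inside a single graph-convolution layer of the usual form act(Â H W). It assumes that the normalized adjacency matrix `a_hat` has already been computed; the helper names `matmul` and `gcn_layer` are hypothetical, and the abstract does not specify exactly where in the second layer the activation is applied.

```python
from typing import Callable, List

Matrix = List[List[float]]


def relu6(x: float) -> float:
    """ReLU6: clamp x to the range [0, 6]."""
    return min(max(0.0, x), 6.0)


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product for small dense matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_hat: Matrix, h: Matrix, w: Matrix,
              act: Callable[[float], float]) -> Matrix:
    """One graph-convolution layer: act(a_hat @ h @ w), element-wise.

    a_hat: normalized adjacency matrix (assumed precomputed)
    h:     node features, one row per node
    w:     layer weight matrix
    """
    z = matmul(matmul(a_hat, h), w)
    return [[act(v) for v in row] for row in z]
```

Passing the activation as an argument makes it straightforward to swap in other candidates, which is the kind of comparison the abstract describes.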