Zhao Yuxia, Mamat Mahpirat, Aysa Alimjan, Ubul Kurban
School of Computer Science and Technology, Xinjiang University, Ürümqi, 830046, Xinjiang, China.
School of Mathematics and Computer Applications, Shangluo University, Shangluo, 726000, Shaanxi, China.
Sci Rep. 2024 Jul 17;14(1):16563. doi: 10.1038/s41598-024-62269-8.
Implicit sentiment identification is a long-standing challenge in text mining because implicit sentiment text lacks explicit sentiment words. Graph neural networks (GNNs) have recently made great progress in natural language processing (NLP) thanks to their strong feature-capture ability, but current GNN-based methods still have two problems. First, the graph structure constructed for implicit sentiment text is of a single, relatively uniform kind and does not comprehensively consider the information in the text, which makes the semantics harder to understand. Second, the initial static graph structure depends heavily on human labor and domain expertise, and errors introduced during its construction cannot be corrected. To solve these problems, we introduce SIF, a dynamic graph structure framework based on the complementarity of semantic and structural information. For the first problem, SIF integrates the semantic and structural information of the text and, based on specialized knowledge, constructs two graph structures: a structural information graph and a semantic information graph. The two graphs complement each other's information, provide rich semantic features for the downstream identification task, and help capture the contextual information between implicit sentiment semantics. For the second problem, SIF dynamically learns from the initial static graph structure to eliminate noise in it, preventing noise accumulation from degrading the performance of the downstream identification task. We compare SIF with mainstream NLP methods on three publicly available datasets, and it outperforms the benchmark models on all three. The accuracy on the Puns of day dataset, the SemEval-2021 Task 7 dataset, and the Reddit dataset reaches 95.73%, 85.37%, and 65.36%, respectively. The experimental results indicate that the proposed method has good application prospects for implicit sentiment identification tasks.
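As a rough illustration of the idea of refining an initial static graph, the sketch below blends a hand-built adjacency matrix with feature similarity between nodes and prunes weak edges. It is not the SIF model: the abstract does not specify how the graph is learned, so a fixed cosine-similarity heuristic stands in for the learned component, and the function names `cosine` and `refine_adjacency` as well as the `blend` and `threshold` parameters are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def refine_adjacency(features, static_adj, blend=0.5, threshold=0.3):
    """Blend a static adjacency matrix with feature similarity, then prune weak edges.

    Assumption: a fixed heuristic replaces the learned graph structure that the
    abstract describes; `blend` and `threshold` are illustrative, not from the paper.
    """
    n = len(features)
    refined = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                refined[i][j] = 1.0  # keep self-loops
                continue
            # Negative similarities are clipped so they cannot cancel a static edge.
            sim = max(cosine(features[i], features[j]), 0.0)
            weight = blend * static_adj[i][j] + (1.0 - blend) * sim
            # Edges with weak combined support are treated as noise and dropped.
            refined[i][j] = weight if weight >= threshold else 0.0
    return refined


# Toy example: nodes 0 and 1 have identical features but no static edge,
# while the static graph contains a spurious edge between nodes 0 and 2.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
static_adj = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
refined = refine_adjacency(features, static_adj, blend=0.3, threshold=0.5)
```

In this toy case the spurious static edge between nodes 0 and 2 is removed because the node features give it no support, while an edge between nodes 0 and 1 is added. In a trained model, the similarity function and the blending weight would typically be learned jointly with the downstream classifier rather than fixed by hand.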