Lv Shaoqing, Dong Jungang, Wang Chichi, Wang Xuanhong, Bao Zhiqiang
School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China.
Shaanxi Key Laboratory of Information Communication Network and Security, Xi'an University of Posts and Telecommunications, Xi'an 710121, China.
Sensors (Basel). 2024 May 24;24(11):3365. doi: 10.3390/s24113365.
With the development of deep learning, several graph neural network (GNN)-based approaches have been utilized for text classification. However, GNNs encounter challenges when capturing contextual text information within a document sequence. To address this, a novel text classification model, RB-GAT, is proposed by combining RoBERTa-BiGRU embedding and a multi-head Graph ATtention Network (GAT). First, the pre-trained RoBERTa model is exploited to learn word and text embeddings in different contexts. Second, the Bidirectional Gated Recurrent Unit (BiGRU) is employed to capture long-term dependencies and bidirectional sentence information from the text context. Next, the multi-head graph attention network is applied to analyze this information, which serves as the node features of the document graph. Finally, the classification results are generated through a Softmax layer. Experimental results on five benchmark datasets demonstrate that our method achieves accuracies of 71.48%, 98.45%, 80.32%, 90.84%, and 95.67% on Ohsumed, R8, MR, 20NG, and R52, respectively, outperforming nine existing text classification approaches.
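The multi-head graph attention step described above can be illustrated with a minimal NumPy sketch of one GAT layer in the style of Veličković et al.: each head projects the node features, computes pairwise attention logits with a LeakyReLU, masks non-edges, normalizes over neighbors, and aggregates; head outputs are concatenated. The dimensions, random weights, and toy graph here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_gat_layer(H, A, W_heads, a_heads, slope=0.2):
    """One multi-head graph attention layer (sketch).

    H: (N, F) node features; A: (N, N) adjacency with self-loops (1 = edge);
    W_heads: per-head (F, F') weight matrices; a_heads: per-head (2F',)
    attention vectors. Returns the concatenated head outputs, shape (N, K*F').
    """
    outs = []
    for W, a in zip(W_heads, a_heads):
        Fp = W.shape[1]
        Wh = H @ W                                  # project: (N, F')
        # attention logits e_ij = LeakyReLU(a^T [Wh_i || Wh_j]),
        # split into a source term and a target term for broadcasting
        e = (Wh @ a[:Fp])[:, None] + (Wh @ a[Fp:])[None, :]
        e = np.where(e > 0, e, slope * e)           # LeakyReLU
        e = np.where(A > 0, e, -1e9)                # mask non-neighbors
        att = softmax(e, axis=1)                    # normalize over neighbors
        outs.append(att @ Wh)                       # weighted aggregation
    return np.concatenate(outs, axis=1)

# toy example: 3 nodes, 4 input features, 2 heads of size 2 (hypothetical sizes)
rng = np.random.default_rng(0)
H = rng.standard_normal((3, 4))
A = np.ones((3, 3))                                 # fully connected toy graph
W_heads = [rng.standard_normal((4, 2)) for _ in range(2)]
a_heads = [rng.standard_normal(4) for _ in range(2)]
out = multi_head_gat_layer(H, A, W_heads, a_heads)
print(out.shape)                                    # concatenated heads: (3, 4)
```

In the paper's pipeline, the node features `H` would come from the RoBERTa-BiGRU encoder rather than random initialization, and the layer output would feed the Softmax classifier.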