RB-GAT: A Text Classification Model Based on RoBERTa-BiGRU with Graph ATtention Network.

Author Information

Lv Shaoqing, Dong Jungang, Wang Chichi, Wang Xuanhong, Bao Zhiqiang

Affiliations

School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China.

Shaanxi Key Laboratory of Information Communication Network and Security, Xi'an University of Posts and Telecommunications, Xi'an 710121, China.

Publication Information

Sensors (Basel). 2024 May 24;24(11):3365. doi: 10.3390/s24113365.

Abstract

With the development of deep learning, several graph neural network (GNN)-based approaches have been applied to text classification. However, GNNs struggle to capture contextual information across a document sequence. To address this, a novel text classification model, RB-GAT, is proposed by combining RoBERTa-BiGRU embeddings with a multi-head Graph ATtention Network (GAT). First, the pre-trained RoBERTa model is used to learn context-dependent word and text embeddings. Second, a Bidirectional Gated Recurrent Unit (BiGRU) captures long-term dependencies and bidirectional sentence information from the text context. Next, the multi-head graph attention network processes this information, which serves as the node features of the document graph. Finally, classification results are produced by a Softmax layer. Experimental results on five benchmark datasets show that the method achieves accuracies of 71.48%, 98.45%, 80.32%, 90.84%, and 95.67% on Ohsumed, R8, MR, 20NG, and R52, respectively, outperforming nine existing text classification approaches.
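The GAT stage described above aggregates node features via multi-head attention over the document graph. The following is a minimal NumPy sketch of a single multi-head graph-attention layer in the style of GAT; it is an illustrative assumption, not the authors' implementation, and the RoBERTa-BiGRU stages are replaced here by random node features.

```python
import numpy as np

def multi_head_gat_layer(h, adj, W_heads, a_heads, slope=0.2):
    """One multi-head graph-attention layer (GAT-style sketch).

    h       : (N, F) node feature matrix
    adj     : (N, N) 0/1 adjacency matrix (self-loops included)
    W_heads : list of (F, F') per-head projection matrices
    a_heads : list of (2*F',) per-head attention vectors
    Returns the concatenation of all head outputs, shape (N, K*F').
    """
    outputs = []
    for W, a in zip(W_heads, a_heads):
        z = h @ W                       # (N, F') projected node features
        N = z.shape[0]
        e = np.zeros((N, N))
        for i in range(N):
            for j in range(N):
                # attention logit e_ij = LeakyReLU(a^T [z_i || z_j])
                s = a @ np.concatenate([z[i], z[j]])
                e[i, j] = s if s > 0 else slope * s
        e = np.where(adj > 0, e, -1e9)  # mask non-neighbors before softmax
        att = np.exp(e - e.max(axis=1, keepdims=True))
        att = att / att.sum(axis=1, keepdims=True)  # row-wise softmax
        outputs.append(att @ z)         # weighted aggregation of neighbors
    return np.concatenate(outputs, axis=1)

# Toy example: 4 nodes, 3 input features, 2 heads of 2 features each.
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 3))
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]], dtype=float)
W_heads = [rng.normal(size=(3, 2)) for _ in range(2)]
a_heads = [rng.normal(size=(4,)) for _ in range(2)]
out = multi_head_gat_layer(h, adj, W_heads, a_heads)
print(out.shape)  # (4, 4): 2 heads x 2 features, concatenated
```

In a full model, the concatenated head outputs would be pooled into a document representation and passed to the Softmax classifier mentioned in the abstract.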

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57cf/11175149/73aab20d49c7/sensors-24-03365-g001.jpg
