Yang Yiping, Cui Xiaohui
Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan University, Wuhan 430000, China.
Entropy (Basel). 2021 Nov 18;23(11):1536. doi: 10.3390/e23111536.
Text classification is a fundamental research task that aims to assign tags to text units. Recently, graph neural networks (GNNs) have exhibited excellent properties in textual information processing, and pre-trained language models have also achieved promising results on many tasks. However, many text processing methods either cannot model the structure of an individual text unit or ignore its semantic features. To solve these problems and jointly exploit a text's structural and semantic information, we propose a Bert-Enhanced text Graph Neural Network model (BEGNN). For each text, we construct a separate text graph based on word co-occurrence relationships and use a GNN to extract text features; in addition, we employ Bert to extract semantic features. The former component captures structural information, while the latter focuses on modeling semantic information. Finally, we interact and aggregate these two feature types of different granularity to obtain a more effective representation. Experiments on standard datasets demonstrate the effectiveness of BEGNN.
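The per-document text graphs described above link words that co-occur near each other. A minimal sketch of such a construction is shown below; the `window_size` parameter and the exact weighting scheme are illustrative assumptions, since the abstract does not specify them.

```python
from collections import defaultdict

def build_cooccurrence_graph(tokens, window_size=3):
    """Build an undirected word co-occurrence graph for one document.

    Nodes are the unique words; an edge connects two words that appear
    within `window_size` positions of each other, weighted by the number
    of such co-occurrences. (Illustrative sketch: the paper's exact
    window size and edge weighting are not specified here.)
    """
    nodes = sorted(set(tokens))
    edges = defaultdict(int)  # (word_a, word_b) -> co-occurrence count
    for i, w in enumerate(tokens):
        # Look ahead within the sliding window for co-occurring words.
        for j in range(i + 1, min(i + window_size, len(tokens))):
            u = tokens[j]
            if u != w:
                # Store each undirected edge under a canonical key.
                edges[tuple(sorted((w, u)))] += 1
    return nodes, dict(edges)

tokens = "graph neural networks process text as a graph".split()
nodes, edges = build_cooccurrence_graph(tokens, window_size=3)
print(nodes)
print(edges[("graph", "neural")])
```

The resulting node list and weighted edges would then be fed to a GNN as the structural view of the document, alongside the Bert-derived semantic features.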