Yu Zhizhi, Jin Di, Wei Jianguo, Li Yawen, Liu Ziyang, Shang Yue, Han Jiawei, Wu Lingfei
IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):14699-14711. doi: 10.1109/TNNLS.2023.3281354. Epub 2024 Oct 7.
Graph neural networks (GNNs) have become widely used for analytical tasks on graph-structured data (i.e., networks). Typical GNNs and their variants adopt a message-passing principle that obtains network representations by propagating attributes along the network topology, which, however, ignores the rich textual semantics (e.g., local word sequences) that exist in numerous real-world networks. Existing methods for text-rich networks integrate textual semantics mainly by using internal information such as topics or phrases/words. These methods often fail to mine the textual semantics comprehensively, which limits the reciprocal guidance between network structure and textual semantics. To address these problems, we present a novel text-rich GNN with external knowledge (TeKo) that makes full use of both structural and textual information within text-rich networks. Specifically, we first present a flexible heterogeneous semantic network that integrates high-quality entities as well as interactions among documents and entities. We then introduce two types of external knowledge, namely structured triplets and unstructured entity descriptions, to gain deeper insight into textual semantics. Furthermore, we devise a reciprocal convolutional mechanism for the constructed heterogeneous semantic network, enabling network structure and textual semantics to enhance each other collaboratively and learn high-level network representations. Extensive experiments show that TeKo achieves state-of-the-art performance on a variety of text-rich networks as well as on a large-scale e-commerce search dataset.
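The message-passing principle mentioned in the abstract can be illustrated with a minimal sketch. This is not TeKo's reciprocal convolutional mechanism, whose details the abstract does not give; it is plain mean aggregation over a toy graph that mixes document and entity nodes. The node names, feature vectors, and the `propagate` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def propagate(features: Dict[str, Vector],
              neighbors: Dict[str, List[str]]) -> Dict[str, Vector]:
    """One round of mean-aggregation message passing (self-loop included)."""
    updated = {}
    for node, own in features.items():
        # Each node averages its own features with those of its neighbors.
        messages = [own] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = mean(messages)
    return updated


# Toy heterogeneous graph (hypothetical): two documents that mention one entity.
features = {
    "doc_a": [1.0, 0.0],     # stands in for a text embedding of document A
    "doc_b": [0.0, 1.0],     # stands in for a text embedding of document B
    "entity_x": [1.0, 1.0],  # stands in for an embedded entity description
}
neighbors = {
    "doc_a": ["entity_x"],
    "doc_b": ["entity_x"],
    "entity_x": ["doc_a", "doc_b"],
}

hidden = propagate(features, neighbors)
for node, vector in hidden.items():
    print(node, vector)
```

After one round, each document representation has absorbed information from the shared entity, and the entity representation has absorbed information from both documents, which is the basic sense in which structure and text-derived features can inform each other.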