Zhang Xikun, Song Dongjin, Tao Dacheng
IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17398-17410. doi: 10.1109/TNNLS.2023.3303454. Epub 2024 Dec 2.
Memory replay, which stores a subset of historical data from previous tasks and replays it while learning new tasks, achieves state-of-the-art performance in various continual learning applications on Euclidean data. Although topological information plays a critical role in characterizing graph data, existing memory replay-based graph learning techniques store only individual nodes for replay and do not consider their associated edge information. To this end, building on the message-passing mechanism in graph neural networks (GNNs), we present a Ricci curvature-based graph sparsification technique for continual graph representation learning. Specifically, we first develop the subgraph episodic memory (SEM) to store topological information in the form of computation subgraphs. Next, we sparsify these subgraphs so that they retain only the most informative structures (nodes and edges). Informativeness is evaluated with the Ricci curvature, a theoretically justified metric for estimating how much each neighbor contributes to the representation of a target node. In this way, we can substantially reduce the memory consumption of a computation subgraph and enable GNNs to fully utilize the most informative topological information for memory replay. Besides, to ensure applicability to large graphs, we also provide a theoretically justified surrogate for the Ricci curvature in the sparsification process, which greatly facilitates the computation. Finally, our empirical studies show that SEM significantly outperforms state-of-the-art approaches on four different public datasets. Unlike existing methods, which mainly focus on the task-incremental learning (task-IL) setting, SEM also succeeds in the challenging class-incremental learning (class-IL) setting, in which the model is required to distinguish all learned classes without task indicators, and it even achieves performance comparable to joint training, the performance upper bound for continual learning.
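For context, a widely used discretization of Ricci curvature on graphs is the Ollivier-Ricci curvature; the abstract does not state which formulation the paper adopts, so the definition below is given only as the standard reference form:

\[
\kappa(x, y) = 1 - \frac{W_1(m_x, m_y)}{d(x, y)},
\]

where \(m_x\) is a probability measure spread over the neighbors of node \(x\), \(W_1\) is the Wasserstein-1 (earth mover's) distance under the shortest-path metric, and \(d(x, y)\) is the graph distance between \(x\) and \(y\). Edges whose endpoints share many neighbors tend to have higher curvature, which is consistent with the abstract's use of curvature as a measure of how informative a neighbor is for representing the target node.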
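As a rough illustration of the sparsification pipeline described above, the following is a minimal sketch, not the authors' implementation: it extracts a target node's L-hop computation subgraph and keeps only the highest-curvature edges under a fixed budget. The augmented Forman curvature used here as a combinatorial stand-in, the `budget` parameter, and all function names are illustrative assumptions; the paper derives its own theoretically justified surrogate, which is not reproduced here.

```python
import networkx as nx

def forman_curvature(G, u, v):
    """Augmented Forman-Ricci curvature of edge (u, v) on an unweighted graph.

    A common combinatorial stand-in for the Ollivier-Ricci curvature:
    F#(u, v) = 4 - deg(u) - deg(v) + 3 * (#triangles through the edge).
    """
    triangles = len(set(G[u]) & set(G[v]))  # common neighbors of u and v
    return 4 - G.degree(u) - G.degree(v) + 3 * triangles

def sparsified_computation_subgraph(G, target, num_hops=2, budget=10):
    """Store a sparsified computation subgraph for `target`.

    Extracts the L-hop ego network (the computation subgraph of an L-layer
    message-passing GNN) and keeps only the `budget` edges with the highest
    curvature, i.e., the structures judged most informative for the target.
    """
    sub = nx.ego_graph(G, target, radius=num_hops)
    # Rank the subgraph's edges by curvature computed on the full graph
    ranked = sorted(sub.edges(), key=lambda e: forman_curvature(G, *e), reverse=True)
    kept_edges = ranked[:budget]
    sparse = nx.Graph()
    sparse.add_node(target)          # the target node is always retained
    sparse.add_edges_from(kept_edges)
    return sparse

if __name__ == "__main__":
    G = nx.karate_club_graph()
    mem = sparsified_computation_subgraph(G, target=0, num_hops=2, budget=8)
    print(mem.number_of_nodes(), mem.number_of_edges())
```

The L-hop ego network is used because, for an L-layer message-passing GNN, that neighborhood is exactly the computation subgraph needed to reproduce the target node's embedding when the sample is replayed.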