Zhang Tong, Liu Guangbu, Cui Zhen, Liu Wei, Zheng Wenming, Yang Jian
IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8619-8635. doi: 10.1109/TPAMI.2024.3409772. Epub 2024 Nov 6.
Mining discriminative graph topological information plays an important role in improving graph representation ability. However, it faces two main issues: (1) the difficulty and complexity of computing global inter-class/intra-class scatters, which commonly depend on the mean and covariance of graph samples, for discriminant learning; and (2) the great complexity and variety of graph topological structures, which are challenging to characterize robustly. In this paper, we propose the Wasserstein Discriminant Dictionary Learning (WDDL) framework to achieve discriminant learning on graphs with robust graph topology modeling, and hence facilitate graph-based pattern analysis tasks. To address the difficulty of calculating global inter-class/intra-class scatters, a reference set of graphs (a graph dictionary) is first constructed by generating representative graph samples (graph keys) with expressive topological structure. Then, a Wasserstein Graph Representation (WGR) process is proposed to project input graphs into a succinct dictionary space through graph dictionary lookup. To further achieve discriminant graph learning, a Wasserstein discriminant loss (WD-loss) is defined on the graph dictionary, in which the graph keys are optimizable, to make the intra-class keys more compact and the inter-class keys more dispersed. Hence, the calculation of global Wasserstein metric (W-metric) centers can be bypassed. For sophisticated topology mining in the WGR process, a joint-Wasserstein graph embedding module is constructed to model both between-node and between-edge relationships across inputs and graph keys by encapsulating both the Wasserstein metric (between cross-graph nodes) and the proposed Kron-Gromov-Wasserstein (KGW) metric (between cross-graph adjacencies). Specifically, the KGW-metric comprehensively characterizes the cross-graph connection patterns with the Kronecker operation, then adaptively captures the salient patterns through connection pooling.
To evaluate the proposed framework, we study two graph-based pattern analysis problems, i.e., graph classification and cross-modal retrieval, with the graph dictionary flexibly adjusted to cater to these two tasks. Extensive experiments are conducted to comprehensively compare with existing advanced methods, as well as to dissect the critical components of our proposed architecture. The experimental results validate the effectiveness of the WDDL framework.
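The dictionary lookup and the Kronecker step described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract gives no formulas, so the code below assumes uniform node masses, a squared Euclidean ground cost, and an entropic (Sinkhorn) approximation of the Wasserstein cost. The names `sinkhorn_cost`, `dictionary_lookup`, and `kron_connection_patterns` are hypothetical, and the connection pooling and WD-loss are not modeled.

```python
import numpy as np


def sinkhorn_cost(X, Y, eps=0.1, n_iters=200):
    """Approximate Wasserstein cost between two node-feature sets.

    Assumption: uniform node masses, squared Euclidean ground cost,
    entropic regularization (Sinkhorn iterations).
    """
    # Pairwise squared Euclidean distances between nodes of the two graphs.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    # Rescale the cost inside the kernel to avoid numerical underflow.
    scale = C.max() if C.max() > 0 else 1.0
    K = np.exp(-C / (eps * scale))
    a = np.full(len(X), 1.0 / len(X))
    b = np.full(len(Y), 1.0 / len(Y))
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    # Transport plan and the cost it incurs under the original ground cost.
    P = u[:, None] * K * v[None, :]
    return float((P * C).sum())


def dictionary_lookup(X, keys, eps=0.1):
    """Represent an input graph by its cost to every graph key.

    `keys` is a list of node-feature matrices, one per (hypothetical) graph key.
    """
    return np.array([sinkhorn_cost(X, key, eps) for key in keys])


def kron_connection_patterns(A_in, A_key):
    """Kronecker product of two adjacency matrices.

    Entry ((i, k), (j, l)) is nonzero only when edge (i, j) exists in the
    input graph and edge (k, l) exists in the graph key.
    """
    return np.kron(A_in, A_key)


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))                       # input graph: 5 nodes, 3 features
keys = [X.copy(), X + 5.0, rng.normal(size=(4, 3))]
code = dictionary_lookup(X, keys)                 # one coordinate per graph key

A_in = np.array([[0, 1], [1, 0]])                 # 2-node input graph
A_key = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])  # 3-node path as a key
patterns = kron_connection_patterns(A_in, A_key)  # shape (6, 6)
```

Here the input graph is encoded as a vector with one entry per graph key, so its dimension is fixed by the dictionary size rather than by the number of nodes, which is one way to read "a succinct dictionary space".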