Yu Junchi, Xu Tingyang, Rong Yu, Bian Yatao, Huang Junzhou, He Ran
IEEE Trans Pattern Anal Mach Intell. 2024 Mar;46(3):1650-1663. doi: 10.1109/TPAMI.2021.3112205. Epub 2024 Feb 6.
The emergence of the Graph Convolutional Network (GCN) has greatly boosted progress in graph learning. However, two disturbing factors impede the further development of GCNs: noise and redundancy in graph data, and the lack of interpretability of prediction results. One solution is to recognize a predictive yet compressed subgraph that discards the noise and redundancy and exposes the interpretable part of the graph. This subgraph setting resembles the information bottleneck (IB) principle, which has been little studied on graph-structured data and GCNs. Inspired by the IB principle, we propose a novel subgraph information bottleneck (SIB) framework to recognize such subgraphs, named IB-subgraph. However, the intractability of mutual information and the discrete nature of graph data make the SIB objective notoriously hard to optimize. To this end, we introduce a bilevel optimization scheme coupled with a mutual information estimator for irregular graphs. Moreover, we propose a continuous relaxation for subgraph selection together with a connectivity loss for stabilization. We further theoretically prove the error bound of our mutual information estimation scheme and the noise-invariant nature of the IB-subgraph. Extensive experiments on graph learning and large-scale point cloud tasks demonstrate the superior properties of the IB-subgraph.
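To make the "continuous relaxation with a connectivity loss" concrete, here is a minimal NumPy sketch under stated assumptions: nodes are selected by a soft mask `p` in [0, 1], the mask induces a soft two-way assignment (in-subgraph vs. out), and the regularizer penalizes selections whose aggregated adjacency deviates from a block-diagonal (well-connected) pattern. The function name `connectivity_loss` and the exact normalization are illustrative choices, not the paper's API.

```python
import numpy as np

def connectivity_loss(A, p):
    """Connectivity regularizer sketch: p is a soft node mask in [0, 1].
    Build a soft 2-way assignment S (selected vs. discarded), aggregate
    the adjacency to a 2x2 matrix S^T A S, row-normalize it, and measure
    its Frobenius distance to the identity. Compact, well-connected
    selections keep most edge mass on the diagonal, giving a low loss."""
    S = np.stack([p, 1.0 - p], axis=1)              # soft assignment, shape (n, 2)
    M = S.T @ A @ S                                  # 2x2 aggregated adjacency
    M = M / (M.sum(axis=1, keepdims=True) + 1e-9)    # row-normalize
    return np.linalg.norm(M - np.eye(2))             # Frobenius distance to I

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2,3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

tight = np.array([1., 1., 1., 0., 0., 0.])      # select one whole triangle
scattered = np.array([1., 0., 1., 0., 1., 0.])  # spread across both triangles

# A connected, compact selection incurs a lower connectivity penalty.
assert connectivity_loss(A, tight) < connectivity_loss(A, scattered)
```

In the full SIB objective this term would be added to the IB trade-off (maximize mutual information with the label, minimize mutual information with the input graph), stabilizing the otherwise fragmented subgraphs that an unconstrained soft mask tends to produce.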