Li Haoyang, Wang Xin, Zhang Ziwei, Zhu Wenwu
IEEE Trans Pattern Anal Mach Intell. 2025 Nov;47(11):10490-10512. doi: 10.1109/TPAMI.2025.3593897.
Graph machine learning has been extensively studied in both academia and industry. Although the field is booming with a vast number of emerging methods and techniques, most of the literature is built on the in-distribution hypothesis, i.e., that testing and training graph data are identically distributed. However, this hypothesis can hardly be satisfied in many real-world graph scenarios, where model performance substantially degrades when distribution shifts exist between testing and training graph data. To address this critical problem, out-of-distribution (OOD) generalization on graphs, which goes beyond the in-distribution hypothesis, has made great progress and attracted ever-increasing attention from the research community. In this paper, we comprehensively survey OOD generalization on graphs and present a detailed review of recent advances in this area. First, we provide a formal problem definition of OOD generalization on graphs. Second, we categorize existing methods into three conceptually distinct classes, i.e., data, model, and learning strategy, based on their positions in the graph machine learning pipeline, followed by detailed discussions of each category. We also review theories related to OOD generalization on graphs and introduce the graph datasets commonly used for thorough evaluations. Finally, we share our insights on future research directions.