Pham Phu
Faculty of Information Technology, HUTECH University, Ho Chi Minh City, Vietnam.
Comput Biol Chem. 2025 Jun 14;119:108548. doi: 10.1016/j.compbiolchem.2025.108548.
Recent advances have shown that topological data analysis (TDA) can be successfully integrated with deep learning (DL) to enhance representations of complex data structures such as graphs. Graph neural networks (GNNs) have emerged as a powerful tool for analyzing graph-based data and have been widely applied to tasks involving node and graph analysis and classification. Given the intrinsically topological nature of graph connectivity, recent studies have leveraged topological features, including persistent homology and landmark extraction, to enrich graph representations. These topology-enhanced GNNs have shown significant promise in improving performance across various graph learning problems, such as classification. One key area of innovation is topological graph pooling, which aims to capture multi-scale (both local and global) topology-aware graph-level representations. However, most existing approaches rely heavily on local-neighborhood feature aggregation, making them insufficient for preserving multi-scale representations of graphs with complex structure. This limitation hinders their ability to capture the rich topological and structural diversity of input graphs during the pooling process. To address this gap, we propose a novel attention-enhanced, multi-scope topological graph pooling strategy, called AETP. AETP is designed to extract both discriminative topology-structured information and graph-level variations, enabling highly expressive representation learning. Our AETP model specifically targets representation learning and classification for complex and small molecular graphs.
Comprehensive comparative studies on various real-world molecular datasets, including FDA_DILIst, T3DB_Toxin_2, Eye_Irritation, and Eye_Corrosion, validate the effectiveness of the proposed AETP model, demonstrating superior performance over existing graph embedding baselines such as GCN, GraphSAGE, GAT, GIN, GINE, UniMP, GATv2, TOGL, and TopoPool.
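The core pooling idea the abstract describes, using attention scores to select the most informative nodes when forming a graph-level representation, can be sketched generically. The following is a minimal illustration of standard attention-based top-k graph pooling (in the spirit of TopK/SAGPool-style methods), not the authors' AETP implementation; the function name, the random projection standing in for a learned attention vector, and the toy graph are all hypothetical.

```python
import numpy as np

def attention_topk_pool(X, A, ratio=0.5):
    """Generic attention-based top-k graph pooling (illustrative sketch only).

    X: (n, d) node feature matrix; A: (n, n) adjacency matrix.
    Scores each node via a projection, keeps the top-k highest-scoring
    nodes, and returns the induced pooled subgraph.
    """
    n, d = X.shape
    rng = np.random.default_rng(0)
    p = rng.standard_normal(d)                   # stand-in for a learned attention vector
    scores = np.tanh(X @ p / np.linalg.norm(p))  # per-node attention scores in (-1, 1)
    k = max(1, int(np.ceil(ratio * n)))
    idx = np.argsort(scores)[-k:]                # indices of the k most salient nodes
    X_pool = X[idx] * scores[idx, None]          # gate kept features by their scores
    A_pool = A[np.ix_(idx, idx)]                 # adjacency restricted to kept nodes
    return X_pool, A_pool, idx

# Toy 4-node graph with 3-dimensional node features
X = np.arange(12, dtype=float).reshape(4, 3)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
Xp, Ap, idx = attention_topk_pool(X, A, ratio=0.5)
print(Xp.shape, Ap.shape)  # (2, 3) (2, 2)
```

A pooling layer like this coarsens the graph so that subsequent GNN layers operate on progressively smaller, more salient subgraphs; topology-enhanced variants additionally fold persistence-based features into the scoring step.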