School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China.
Department of Computer Science, Shantou University, Shantou 515063, China.
Bioinformatics. 2024 Feb 1;40(2). doi: 10.1093/bioinformatics/btae025.
The human microbiome may affect the effectiveness of drugs by modulating their activities and toxicities. Predicting candidate microbes for drugs can therefore facilitate the exploration of their therapeutic effects. Most recent methods concentrate on constructing prediction models based on graph reasoning. However, they fail to sufficiently exploit the topology and position information of nodes, the heterogeneity of multiple types of nodes and connections, and the long-distance correlations among nodes in the microbe-drug heterogeneous graph.
We propose a new microbe-drug association prediction model, NGMDA, which encodes the position and topology features of microbe (drug) nodes and fuses the different types of features from neighbors and from the whole heterogeneous graph. First, we formulate the position and topology features of microbe (drug) nodes using t-step random walks; these features reveal the topological neighborhoods at multiple scales and the position of each node. Second, because the node features are high-dimensional and sparse, we design an embedding enhancement strategy based on supervised fully connected autoencoders to form embeddings with representative features and more discriminative node distributions. Third, we propose an adaptive neighbor feature fusion module, which fuses the features of neighbors through position- and topology-sensitive heterogeneous graph neural networks. A novel self-attention mechanism is developed to estimate the importance of the position and topology of each neighbor to a target node. Finally, a heterogeneous graph feature fusion module is constructed to learn the long-distance correlations among the nodes in the whole heterogeneous graph with a relationship-aware graph transformer. The relationship-aware graph transformer includes a strategy for encoding the types of connection relationships among nodes, which helps integrate the diverse semantics of these connections. Extensive comparison experiments show that NGMDA outperforms five state-of-the-art prediction methods. Ablation experiments confirm the contributions of the multi-scale topology and position feature learning, the embedding enhancement strategy, the neighbor feature fusion, and the heterogeneous graph feature fusion. Case studies on three drugs further indicate that NGMDA is able to discover potential drug-related microbes.
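The t-step random-walk features described in the abstract can be illustrated with a minimal sketch. This is not the NGMDA implementation; the abstract does not give the exact formulation. The sketch assumes that the topology feature of a node is its row of t-step transition probabilities for t = 1..t_max, and that the position feature is its t-step return probability (the diagonal of the t-th power of the transition matrix), as is common in random-walk positional encodings. The function name `random_walk_features` and the parameter `t_max` are hypothetical.

```python
import numpy as np


def random_walk_features(adj, t_max):
    """Illustrative multi-scale random-walk features (assumed formulation).

    adj   : (n, n) non-negative adjacency matrix of the graph
    t_max : largest walk length; features are built for t = 1..t_max

    Returns
    -------
    topo : (n, n * t_max) array, concatenated rows of P^1 .. P^t_max
    pos  : (n, t_max) array, return probabilities diag(P^t) per step
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise to a transition matrix; isolated nodes keep a zero row.
    trans = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)

    topo_blocks = []
    pos_cols = []
    power = np.eye(n)
    for _ in range(t_max):
        power = power @ trans                 # P^t
        topo_blocks.append(power)             # t-step neighbourhood profile
        pos_cols.append(np.diag(power))       # probability of returning in t steps
    topo = np.concatenate(topo_blocks, axis=1)
    pos = np.stack(pos_cols, axis=1)
    return topo, pos
```

Small values of t capture the immediate neighborhood, while larger values reach more distant nodes, which is one way to read "topological neighborhoods at multiple scales". The resulting vectors are high-dimensional and sparse for large graphs, which is consistent with the motivation the abstract gives for the subsequent embedding step.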
Source code and Supplementary Material are available at https://github.com/pingxuan-hlju/NGMDA.