Zhu Rong, Wang Yong, Shang Junliang, Dai Ling-Yun, Li Feng
School of Computer Science, Qufu Normal University, Rizhao, Shandong, China.
Laboratory Experimental Teaching and Equipment Management Center, Qufu Normal University, Rizhao, Shandong, China.
PeerJ Comput Sci. 2025 Aug 15;11:e3098. doi: 10.7717/peerj-cs.3098. eCollection 2025.
Microorganisms play an important role in many complex diseases, influencing their onset, progression, and potential treatment outcomes. Exploring the associations between microbes and human diseases can deepen our understanding of disease mechanisms and assist in improving diagnosis and therapy. However, traditional biological experiments used to uncover such relationships often demand substantial time and resources. In response to these limitations, computational methods have gained traction as more practical tools for predicting microbe-disease associations. Despite their growing use, many of these models still face challenges in terms of accuracy, stability, and adaptability to noisy or sparse data. To overcome the aforementioned limitations, we propose a novel predictive framework, HyperGraph Neural Network with Transformer for Microbe-Disease Associations (HGNNTMDA), designed to infer potential associations between human microbes and diseases. The framework begins by integrating microbe-disease association data with similarity-based features to construct node representations. Two graph construction strategies are employed: a K-nearest neighbor (KNN)-based adjacency matrix to build a standard graph, and a K-means clustering approach that groups similar nodes into clusters, which serve as hyperedges to define the incidence matrix of a hypergraph. Separate hypergraph neural networks (HGNNs) are then applied to microbe and disease graphs to extract structured node-level features. An attention mechanism (AM) is subsequently introduced to emphasize informative signals, followed by a Transformer module to capture contextual dependencies and enhance global feature representation. A fully connected layer then projects these features into a unified space, where association scores between microbes and diseases are computed. For model optimization, we propose a hybrid loss strategy combining contrastive loss and Huber loss. 
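The two graph construction strategies named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`knn_adjacency`, `kmeans_labels`, `incidence_matrix`), the Euclidean distance, the farthest-point initialisation, and the toy feature vectors are all assumptions made for illustration. It only shows how a KNN rule yields an adjacency matrix and how K-means clusters can be read as hyperedges of an incidence matrix.

```python
import math


def knn_adjacency(features, k):
    """Symmetric 0/1 adjacency: link each node to its k nearest neighbours."""
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Rank every other node by Euclidean distance to node i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )
        for j in others[:k]:
            adj[i][j] = adj[j][i] = 1  # symmetrise the neighbour relation
    return adj


def kmeans_labels(features, n_clusters, n_iter=100):
    """Lloyd's k-means with a deterministic farthest-point initialisation
    (an assumption of this sketch, chosen so the result is reproducible)."""
    centroids = [list(features[0])]
    while len(centroids) < n_clusters:
        far = max(features, key=lambda f: min(math.dist(f, c) for c in centroids))
        centroids.append(list(far))
    labels = []
    for _ in range(n_iter):
        new_labels = [
            min(range(n_clusters), key=lambda c: math.dist(f, centroids[c]))
            for f in features
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(n_clusters):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def incidence_matrix(labels, n_clusters):
    """H[v][e] = 1 when node v belongs to cluster e, i.e. hyperedge e."""
    return [[1 if lab == e else 0 for e in range(n_clusters)] for lab in labels]


# Toy node features (hypothetical): two well-separated groups of three nodes.
features = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
            [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]

adjacency = knn_adjacency(features, k=2)
labels = kmeans_labels(features, n_clusters=2)
H = incidence_matrix(labels, n_clusters=2)

print("KNN adjacency:", adjacency)
print("Cluster labels:", labels)
print("Hypergraph incidence matrix:", H)
```

Each column of `H` is one hyperedge, so a hyperedge can join any number of nodes, whereas every nonzero entry of the KNN adjacency matrix joins exactly two.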
The contrastive loss aids in learning discriminative embeddings, while the Huber loss enhances robustness against outliers and improves predictive stability. The effectiveness of HGNNTMDA is validated on two benchmark datasets, HMDAD and Disbiome, using five-fold cross-validation (5CV). Our model achieves an AUC of 0.9976 on HMDAD and 0.9423 on Disbiome, outperforming six existing state-of-the-art methods. Further case studies confirm its practical value in discovering novel microbe-disease associations.
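The hybrid loss can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: the margin-based pairwise form of the contrastive loss, the Huber threshold `delta`, the mixing weight `alpha`, and the function names are hypothetical and chosen only to show how the two terms might be combined.

```python
import math


def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        r = abs(t - p)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(y_true)


def contrastive_loss(pairs, labels, margin=1.0):
    """Mean pairwise contrastive loss in its margin form (assumed here).

    Associated pairs (label 1) are penalised by their squared distance;
    non-associated pairs (label 0) are penalised only when they lie
    closer together than `margin`.
    """
    total = 0.0
    for (u, v), y in zip(pairs, labels):
        d = math.dist(u, v)
        total += y * d * d + (1 - y) * max(0.0, margin - d) ** 2
    return total / len(labels)


def hybrid_loss(pairs, labels, scores, alpha=0.5, delta=1.0, margin=1.0):
    """Weighted sum of the two terms; `alpha` is a hypothetical mixing weight."""
    return (alpha * contrastive_loss(pairs, labels, margin)
            + (1 - alpha) * huber_loss(labels, scores, delta))


# Toy example (hypothetical values): one (microbe, disease) embedding pair
# with no known association and a predicted score of 0.5.
pairs = [((0.0, 0.0), (0.0, 0.5))]
labels = [0]
scores = [0.5]
loss = hybrid_loss(pairs, labels, scores)
print("hybrid loss:", loss)
```

The linear branch of the Huber term is what limits the influence of outliers: a residual of 3.0 with `delta=1.0` contributes 2.5, where a squared-error term of the same scale would contribute 4.5.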