Zhang Chuyi, Zhang Zhen, Zhang Feng, Zeng Bin, Liu Xin, Wang Lei
Big Data Innovation and Entrepreneurship Education Center of Hunan Province, Changsha University, Changsha, China.
Front Microbiol. 2024 Jul 29;15:1435408. doi: 10.3389/fmicb.2024.1435408. eCollection 2024.
Accumulating evidence shows that human health and disease are closely related to the microbes in the human body.
In this manuscript, we propose GCANCAE, a new computational model based on graph attention networks and sparse autoencoders, for inferring potential microbe-disease associations. In GCANCAE, we first construct a heterogeneous network by combining known microbe-disease associations, disease similarity, and microbe similarity. We then adopt an improved GCN and the CSAE to extract neighbor relations from the adjacency matrix and novel feature representations from the heterogeneous network. Finally, to estimate the likelihood that a microbe is associated with a disease, we integrate these two types of representations into separate feature matrices for diseases and for microbes, and obtain predicted scores for potential microbe-disease associations by computing the inner product of the two feature matrices.
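The network construction and the scoring step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the block layout of the adjacency matrix, the function names, and the random arrays standing in for the learned GCN/CSAE representations are all assumptions made for illustration.

```python
import numpy as np


def build_heterogeneous_adjacency(assoc, disease_sim, microbe_sim):
    """Assemble one square adjacency matrix from the three inputs.

    assoc:       (n_diseases, n_microbes) binary matrix of known associations
    disease_sim: (n_diseases, n_diseases) disease similarity matrix
    microbe_sim: (n_microbes, n_microbes) microbe similarity matrix
    The block layout [[Sd, A], [A^T, Sm]] is an assumption, not taken from the paper.
    """
    top = np.hstack([disease_sim, assoc])
    bottom = np.hstack([assoc.T, microbe_sim])
    return np.vstack([top, bottom])


def score_associations(disease_features, microbe_features):
    """Score every disease-microbe pair as the inner product of their feature rows."""
    return disease_features @ microbe_features.T


# Toy data: 3 diseases, 4 microbes (hypothetical values).
assoc = np.array([[1, 0, 0, 1],
                  [0, 1, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
disease_sim = np.eye(3)
microbe_sim = np.eye(4)
adjacency = build_heterogeneous_adjacency(assoc, disease_sim, microbe_sim)

# Random stand-ins for the learned representations (assumption: 5-dimensional).
rng = np.random.default_rng(0)
disease_features = rng.normal(size=(3, 5))
microbe_features = rng.normal(size=(4, 5))
scores = score_associations(disease_features, microbe_features)
```

Here `scores[i, j]` is the predicted score for disease `i` and microbe `j`; in the model itself, the feature matrices would come from the integrated GCN and CSAE representations rather than from a random generator.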
Extensive experiments on benchmark databases such as HMDAD and Disbiome were conducted to evaluate the predictive ability of GCANCAE. The results showed that GCANCAE outperformed state-of-the-art competing methods under both 2-fold and 5-fold cross-validation (CV). Furthermore, case studies of three common diseases, namely asthma, irritable bowel syndrome (IBS), and type 2 diabetes (T2D), confirmed the effectiveness of GCANCAE.
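For readers unfamiliar with k-fold CV in association prediction, the sketch below shows one common protocol: the known associations are split into k folds, and each fold in turn is hidden from the training matrix and used for testing. The abstract does not specify how the folds were formed, so the splitting scheme, the function name `kfold_association_splits`, and the toy matrix are assumptions.

```python
import numpy as np


def kfold_association_splits(assoc, k, seed=0):
    """Yield (train_matrix, held_out_indices) for each of k folds.

    Known associations (entries equal to 1) are shuffled and divided into k
    folds; the entries of the current fold are set to 0 in the training copy.
    """
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(assoc)
    order = rng.permutation(len(rows))
    for fold in np.array_split(order, k):
        train = assoc.copy()
        train[rows[fold], cols[fold]] = 0  # hide this fold's known associations
        yield train, (rows[fold], cols[fold])


# Toy association matrix with 10 known associations (hypothetical values).
toy_assoc = np.zeros((5, 6))
toy_assoc[[0, 0, 1, 1, 2, 2, 3, 3, 4, 4],
          [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]] = 1

splits = list(kfold_association_splits(toy_assoc, k=5))
```

With `k=2` the same function gives the 2-fold setting. A model would be trained on each `train` matrix and evaluated on how highly it ranks the held-out pairs.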