School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, Gansu, PR China.
Comput Biol Chem. 2024 Feb;108:107989. doi: 10.1016/j.compbiolchem.2023.107989. Epub 2023 Nov 22.
A growing body of experimental evidence in the biomedical field has revealed prevalent associations between circRNAs and human diseases. These associations offer a new perspective for elucidating disease etiology and devising innovative therapeutic strategies. In recent years, many computational methods have been introduced to overcome the inefficiency and high cost of conventional laboratory approaches to identifying candidate circRNA-disease associations. However, most existing methods still struggle to integrate node embeddings effectively with higher-order neighborhood representations, which may limit their predictive accuracy. To address this limitation, we propose AMPCDA, a computational method that uses predefined metapaths to predict circRNA-disease associations. Specifically, an association graph is first built from three source databases and two similarity derivation procedures, and DeepWalk is then applied to the graph to obtain initial feature representations. Vector embeddings of metapath instances, formed by concatenating initial node features, are then fed through a customized encoder. A self-attention module accumulates the metapath-specific contributions to each node; these are combined with the node's intrinsic features and passed to a graph attention module, which provides the input representations for a multilayer perceptron that predicts the final association probability scores. By integrating graph topology features with the node embeddings themselves, AMPCDA effectively leverages the information carried by multiple nodes along paths and achieves strong predictive performance, with AUC values of 0.9623, 0.9675, and 0.9711 under 5-fold cross-validation, 10-fold cross-validation, and leave-one-out cross-validation, respectively. These results represent substantial accuracy improvements over other prediction models. Case studies confirm the high predictive accuracy of the proposed method in identifying circRNA-disease associations, highlighting its value in guiding future biological research to reveal new disease mechanisms.
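The metapath step described above can be illustrated with a minimal sketch. It is not the AMPCDA implementation: the toy graph, the node names, the hand-written feature vectors (standing in for DeepWalk output), the mean-pooling encoder, and the dot-product attention score are all assumptions chosen for illustration. It enumerates circRNA-disease-circRNA metapath instances, concatenates node features along each instance, encodes each instance, and aggregates the instances with softmax attention weights.

```python
import math

# Hypothetical toy associations: circRNA -> associated diseases.
associations = {
    "circA": ["d1", "d2"],
    "circB": ["d1"],
    "circC": ["d2"],
}

# Hand-written stand-ins for initial node features
# (the abstract obtains these with DeepWalk).
features = {
    "circA": [1.0, 0.0],
    "circB": [0.0, 1.0],
    "circC": [1.0, 1.0],
    "d1": [0.5, 0.5],
    "d2": [0.0, 2.0],
}


def metapath_instances(start):
    """Enumerate circRNA-disease-circRNA paths that begin at `start`."""
    instances = []
    for disease in associations[start]:
        for other, diseases in associations.items():
            if other != start and disease in diseases:
                instances.append((start, disease, other))
    return instances


def encode_instance(instance):
    """Concatenate the features of the nodes along one metapath instance."""
    vec = []
    for node in instance:
        vec.extend(features[node])
    return vec


def pool(vec, dim):
    """Assumed encoder: average the per-node chunks of a concatenated vector."""
    n = len(vec) // dim
    return [sum(vec[i * dim + j] for i in range(n)) / n for j in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(target, instances):
    """Weight encoded instances by dot-product similarity to the target node."""
    query = features[target]
    if not instances:
        return [], list(query)
    encoded = [pool(encode_instance(inst), len(query)) for inst in instances]
    scores = [sum(q * e for q, e in zip(query, enc)) for enc in encoded]
    weights = softmax(scores)
    aggregated = [
        sum(w * enc[j] for w, enc in zip(weights, encoded))
        for j in range(len(query))
    ]
    return weights, aggregated


instances = metapath_instances("circA")
weights, aggregated = attention_aggregate("circA", instances)
print(instances)
print(weights, aggregated)
```

In a trained model the encoder and the attention scores would be learned rather than fixed, and the aggregated vector would be combined with the node's own features before the graph attention and multilayer perceptron stages mentioned in the abstract.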