Fu Yao, Yang Runtao, Zhang Lina, Fu Xu
IEEE J Biomed Health Inform. 2023 Oct;27(10):5177-5186. doi: 10.1109/JBHI.2023.3299042. Epub 2023 Oct 5.
Circular RNAs (circRNAs) are specifically and abnormally expressed in disease tissues and can therefore serve as biomarkers for diagnosing the relevant diseases. Predicting circRNA-disease associations provides essential clues for revealing the molecular mechanisms of disease development and discovering novel therapeutic targets. Existing algorithms ignore the heterogeneous biological association information related to microRNAs (miRNAs). This paper develops HGECDA, a novel circRNA-disease association prediction method based on a heterogeneous graph embedding model. First, a heterogeneous graph network containing circRNA-miRNA-disease association information is constructed. To sample the heterogeneous information, a meta-path-based random walk that can capture the relevance between various types of nodes is employed. Then, a path embedding model based on skip-gram and random negative sampling is built to acquire the initial feature vectors of circRNAs and diseases. Finally, the CosMulformer model, with linearized self-attention and a Hadamard product, is designed to obtain the circRNA-disease interaction vectors and perform the prediction task. Experimental results demonstrate the critical role of miRNA in enriching the feature space, the effectiveness of the CosMulformer model in extracting deep local interaction features, and the feasibility of the Hadamard product as the integration pattern in the CosMulformer model. On the same dataset, HGECDA outperforms seven existing state-of-the-art methods. Moreover, case studies on breast cancer and colorectal cancer demonstrate the practical value of HGECDA in predicting potential circRNA-disease associations.
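The sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the node names, the "cmdm" meta-path (circRNA, miRNA, disease, miRNA, repeating), and the helper names `meta_path_walk` and `skipgram_pairs` are all assumptions made for illustration. The sketch shows only how a walk constrained by node type produces sequences, and how (center, context) pairs for a skip-gram model could be read off such a sequence; negative sampling and the CosMulformer model are omitted.

```python
import random

# Toy heterogeneous graph (hypothetical nodes and edges, not from the paper).
# The node type is encoded by the first letter: c = circRNA, m = miRNA, d = disease.
EDGES = [
    ("c1", "m1"), ("c1", "m2"), ("c2", "m2"),
    ("m1", "d1"), ("m2", "d1"), ("m2", "d2"),
]


def node_type(node):
    """Return the one-letter type tag of a node name."""
    return node[0]


def build_adjacency(edges):
    """Build an undirected adjacency list from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def meta_path_walk(adj, start, meta_path, walk_length, rng):
    """Random walk whose node types follow meta_path cyclically.

    meta_path="cmdm" means c -> m -> d -> m -> c -> ... (an assumed pattern).
    The walk stops early if no neighbour of the required type exists.
    """
    if node_type(start) != meta_path[0]:
        raise ValueError("start node type must match the first meta-path type")
    walk = [start]
    while len(walk) < walk_length:
        wanted = meta_path[len(walk) % len(meta_path)]
        candidates = [n for n in adj.get(walk[-1], []) if node_type(n) == wanted]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk


def skipgram_pairs(walk, window):
    """List (center, context) pairs within `window` steps along a walk."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


if __name__ == "__main__":
    rng = random.Random(0)
    adj = build_adjacency(EDGES)
    walk = meta_path_walk(adj, "c1", "cmdm", 9, rng)
    print(walk)
    print(skipgram_pairs(walk, window=2)[:6])
```

In a fuller pipeline, many such walks would be generated from every circRNA and disease node, and the resulting pairs would be fed to a skip-gram model with negative sampling to learn the initial feature vectors.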