Zhao Bo-Wei, Su Xiao-Rui, Yang Yue, Li Dong-Xu, Li Guo-Dong, Hu Peng-Wei, Luo Xin, Hu Lun
College of Computer and Information Science, School of Software, Southwest University, Chongqing 400715, China.
The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
Comput Struct Biotechnol J. 2024 Jul 6;23:2924-2933. doi: 10.1016/j.csbj.2024.06.032. eCollection 2024 Dec.
Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) are closely related to the treatment of human diseases. Traditional biological experiments are often time-consuming and labor-intensive in their search for mechanisms of disease. Computational methods are regarded as an effective way to predict unknown lncRNA-miRNA interactions (LMIs). However, most of them focus mainly on a single lncRNA-miRNA network and do not consider the complex mechanisms among biomolecules in life activities, which are believed to be useful for improving the accuracy of LMI prediction. To address this, we propose a heterogeneous information network (HIN) learning model with neighborhood-level structural representation, called HINLMI, to precisely identify LMIs. In particular, HINLMI first constructs a HIN by integrating nine types of interactions among five types of biomolecules. After that, different representation learning strategies are applied to learn the biological and network representations of lncRNAs and miRNAs in the HIN from different perspectives. Finally, HINLMI incorporates the XGBoost classifier to predict unknown LMIs using the final embeddings of lncRNAs and miRNAs. Experimental results show that HINLMI achieves the best performance on a real dataset when compared with state-of-the-art computational models. Moreover, several analysis experiments indicate that the simultaneous consideration of the biological knowledge and network topology of lncRNAs and miRNAs allows HINLMI to accurately predict LMIs from a more comprehensive perspective. The promising performance of HINLMI also reveals that the utilization of rich heterogeneous information can provide an alternative insight for identifying novel interactions between lncRNAs and miRNAs.
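The general idea of placing lncRNAs and miRNAs in a heterogeneous network and describing a candidate pair by its typed neighborhood can be illustrated with a minimal sketch. This is not the HINLMI representation learning procedure, which the abstract does not specify in detail. The node types, node names, edges, and the shared-neighbor features below are all hypothetical choices made for illustration. In a full pipeline, feature vectors of this kind would be passed to a classifier such as XGBoost, which is omitted here to keep the sketch dependency-free.

```python
from collections import defaultdict

# Hypothetical typed edges. Each node is a (type, name) tuple.
# These types and names are illustrative and not taken from the paper's dataset.
EDGES = [
    (("lncRNA", "L1"), ("miRNA", "M1")),
    (("lncRNA", "L1"), ("protein", "P1")),
    (("lncRNA", "L1"), ("disease", "D1")),
    (("lncRNA", "L2"), ("disease", "D2")),
    (("miRNA", "M1"), ("protein", "P1")),
    (("miRNA", "M1"), ("disease", "D2")),
    (("miRNA", "M2"), ("protein", "P1")),
    (("miRNA", "M2"), ("disease", "D1")),
]


def build_hin(edges):
    """Build an undirected adjacency map: node -> set of neighboring nodes."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def pair_features(adj, lnc, mir, node_types):
    """Describe an (lncRNA, miRNA) pair by its shared neighbors.

    For each node type, emit two values: the number of shared neighbors of
    that type, and their Jaccard overlap (0.0 when neither node has any).
    """
    feats = []
    for node_type in node_types:
        a = {n for n in adj.get(lnc, set()) if n[0] == node_type}
        b = {n for n in adj.get(mir, set()) if n[0] == node_type}
        shared, union = a & b, a | b
        feats.append(len(shared))
        feats.append(len(shared) / len(union) if union else 0.0)
    return feats


hin = build_hin(EDGES)
bridge_types = ["protein", "disease"]

# Candidate pair with no known direct edge: L1 and M2 share P1 and D1.
x_l1_m2 = pair_features(hin, ("lncRNA", "L1"), ("miRNA", "M2"), bridge_types)

# L2 and M1 share only the disease D2.
x_l2_m1 = pair_features(hin, ("lncRNA", "L2"), ("miRNA", "M1"), bridge_types)
```

Shared-neighbor counts are only a crude stand-in for learned embeddings, but they show why extra node types help: L1 and M2 have no direct edge, yet their common protein and disease neighbors give the pair a non-trivial feature vector that a single lncRNA-miRNA network could not provide.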