Yi Hai-Cheng, You Zhu-Hong, Guo Zhen-Hao
Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China.
University of Chinese Academy of Sciences, Beijing, China.
Front Genet. 2019 Nov 7;10:1106. doi: 10.3389/fgene.2019.01106. eCollection 2019.
A key aim of post-genomic biomedical research is to understand and model complex biomolecular activities from a systematic perspective. Biomolecular interactions are widespread and interrelated: multiple biomolecules coordinate to sustain life activities, and any disturbance of these complex connections can lead to abnormal life activities or complex diseases. However, many existing studies focus only on individual intermolecular interactions. In this work, we revealed, constructed, and analyzed a large-scale molecular association network of multiple biomolecules in humans by integrating associations among lncRNAs, miRNAs, proteins, drugs, and diseases, in which the various associations are interconnected and any type of association can be predicted. We propose Molecular Association Network (MAN)-High-Order Proximity preserved Embedding (HOPE), a novel method based on network representation learning that exploits the latent features of biomolecules to accurately predict associations between molecules. More specifically, the network representation learning algorithm HOPE was applied to learn the behavior features of nodes in the association network, and the attribute features of nodes were also adopted. A CatBoost machine learning model was then trained to predict potential associations between any pair of nodes. The performance of our method was evaluated under five-fold cross-validation, and a case study predicting miRNA-disease associations was conducted to verify its predictive capability. MAN-HOPE achieves an accuracy of 93.3% and an area under the receiver operating characteristic curve (AUC) of 0.9793. The experimental results demonstrate the novelty of our systematic understanding of intermolecular associations and enable systematic exploration of the landscape of molecular interactions that shape specialized cellular functions.
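The embedding step described in the abstract can be illustrated with a minimal sketch. It assumes the Katz index as the high-order proximity measure, which is one of the proximities HOPE supports; the abstract does not say which one the authors used. It also uses a dense SVD on a toy graph, whereas HOPE as published relies on a scalable generalized SVD. The names `hope_embedding`, `pair_features`, and `toy_adj`, and the default choice of `beta`, are hypothetical and not taken from the paper.

```python
import numpy as np


def hope_embedding(adj, dim, beta=None):
    """HOPE-style embedding from the Katz proximity S = (I - beta*A)^-1 (beta*A).

    Returns (source, target) matrices of shape (n_nodes, dim) such that
    source @ target.T is the best rank-`dim` approximation of S.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    if beta is None:
        # Assumption: the Katz series converges only when beta is below
        # 1 / spectral radius of A, so pick half of that bound.
        radius = np.max(np.abs(np.linalg.eigvals(adj)))
        beta = 0.5 / radius
    # Solve (I - beta*A) S = beta*A instead of forming an explicit inverse.
    katz = np.linalg.solve(np.eye(n) - beta * adj, beta * adj)
    u, s, vt = np.linalg.svd(katz)
    scale = np.sqrt(s[:dim])
    source = u[:, :dim] * scale
    target = vt[:dim, :].T * scale
    return source, target


def pair_features(source, target, i, j, attrs=None):
    """Concatenate embeddings (and optional attribute vectors) for pair (i, j)."""
    parts = [source[i], target[j]]
    if attrs is not None:
        parts += [attrs[i], attrs[j]]
    return np.concatenate(parts)


# Toy undirected graph with 5 nodes (illustrative only, not the MAN data).
toy_adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

source, target = hope_embedding(toy_adj, dim=2)
print(pair_features(source, target, 0, 3).shape)
```

Vectors built this way for known and sampled unknown pairs could then be passed to a gradient-boosted classifier such as CatBoost and scored under five-fold cross-validation, as the abstract describes. That step is omitted here because the abstract gives no training details.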