Wu Meiqi, Yang Yingxi, Wang Hui, Ding Jun, Zhu Huan, Xu Yan
1. Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2. Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China.
Curr Genomics. 2019 Dec;20(8):581-591. doi: 10.2174/1389202920666191023090215.
With the rapid development of biological research, microRNAs (miRNAs) have attracted increasing worldwide attention. A growing number of biological studies and experiments have shown that miRNAs are related to the occurrence and development of many key biological processes that cause complex human diseases. Identifying associations between miRNAs and diseases therefore helps in diagnosing those diseases. Although some studies have found a considerable number of miRNA-disease associations, many more remain to be identified. Experimental methods for uncovering miRNA-disease associations are time-consuming and expensive, so effective computational methods are urgently needed to predict new associations.
In this work, we propose an integrated method for predicting potential associations between miRNAs and diseases (IMPMD). The enhanced similarity for miRNAs is obtained by combining functional similarity, Gaussian similarity, and Jaccard similarity. For diseases, it is obtained by combining semantic similarity, Gaussian similarity, and Jaccard similarity. We then use these two enhanced similarities to construct features and calculate a cumulative score to select robust features. Finally, general linear regression is applied to assign weights to the Support Vector Machine, K-Nearest Neighbor, and Logistic Regression algorithms.
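The similarity step above can be illustrated with a minimal sketch. The abstract does not say how the three similarities are combined, so the element-wise average used here is an assumption, as are the function names, the toy association matrix, the placeholder functional-similarity values, and the choice of kernel bandwidth (one over the mean squared profile norm, a common convention for Gaussian interaction profile kernels). It is not the IMPMD implementation.

```python
import math


def jaccard_similarity(a, b):
    """Jaccard similarity between two binary association profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def gaussian_similarity(profiles):
    """Gaussian interaction profile kernel similarity between all rows."""
    n = len(profiles)
    sq_norms = [sum(x * x for x in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    # Assumption: bandwidth is 1 / (mean squared norm of the profiles).
    gamma = 1.0 / mean_sq if mean_sq else 1.0
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((x - y) ** 2 for x, y in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * d2)
    return sim


def enhanced_similarity(matrices):
    """Element-wise average of same-shape matrices (assumed combination rule)."""
    n = len(matrices[0])
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)]
            for i in range(n)]


# Toy miRNA-disease association matrix: rows are miRNAs, columns are diseases.
associations = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
n_mirna = len(associations)

jaccard = [[jaccard_similarity(associations[i], associations[j])
            for j in range(n_mirna)] for i in range(n_mirna)]
gaussian = gaussian_similarity(associations)

# Hypothetical functional similarity values, invented for illustration only.
functional = [
    [1.0, 0.6, 0.1],
    [0.6, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

mirna_enhanced = enhanced_similarity([functional, gaussian, jaccard])
```

The disease side would follow the same pattern on the columns of the association matrix, with semantic similarity taking the place of functional similarity.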
IMPMD achieves an AUC of 0.9386 in 10-fold cross-validation, which is better than most previous models. To further evaluate our model, we apply IMPMD in two types of case studies, for lung cancer and breast cancer. Of the top 50 predicted miRNAs, 49 (lung cancer) and 50 (breast cancer) are validated by experimental discoveries.
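For readers unfamiliar with the metric, AUC is the probability that a randomly chosen true association receives a higher score than a randomly chosen non-association, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are illustrative assumptions and are unrelated to the reported 0.9386.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: 1 marks a known association, 0 an unlabeled pair.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_auc = auc_score(toy_labels, toy_scores)  # 3 of 4 pairs correct -> 0.75
```

In k-fold cross-validation, this score would be computed on each held-out fold and then averaged across folds.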
We built a software tool named IMPMD, which can be freely downloaded from https://github.com/Sunmile/IMPMD.