a School of Computer Science, China University of Geosciences, Wuhan, China.
b Department of Hematology, The Affiliated Huai'an Hospital of Xuzhou Medical University, Huai'an, China.
RNA Biol. 2019 May;16(5):601-611. doi: 10.1080/15476286.2019.1570811. Epub 2019 Feb 20.
Many miRNA-disease associations have already been verified, and discovering more of them would aid the diagnosis and prevention of complex human diseases. However, identifying potential associations with traditional biological experiments is impractical because the process is expensive and time-consuming, so efficient computational methods are needed. In this work, we introduce a matrix completion model with dual Laplacian regularization (DLRMC) to infer unknown miRNA-disease associations from heterogeneous omics data. Specifically, DLRMC casts miRNA-disease association prediction as a matrix completion problem: the missing entries of the miRNA-disease association matrix are estimated, and candidate associations are then ranked by their prediction scores after completion. The miRNA functional similarity and the disease semantic similarity are incorporated into the completion through a dual Laplacian regularization term. We evaluated DLRMC with global and local leave-one-out cross-validation (LOOCV) and with case studies on the human miRNA-disease association dataset obtained from the HMDD v2.0 database. DLRMC achieved AUCs of 0.9174 and 0.8289 in global and local LOOCV, respectively, significantly outperforming a variety of previous methods. In case studies on four diseases important to human health, namely Colon Neoplasms, Kidney Neoplasms, Lymphoma and Prostate Neoplasms, 90%, 92%, 92% and 94% of the top 50 predicted miRNAs were confirmed, respectively.
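The idea of completing an association matrix under a dual Laplacian penalty can be illustrated with a small sketch. The abstract does not give the DLRMC objective or solver, so the code below is a simplified stand-in rather than the authors' algorithm: it minimizes a masked squared error plus the two smoothness terms tr(X^T L_m X) and tr(X L_d X^T) by plain gradient descent. The function names, the weights alpha and beta, the default of treating unknown entries as observed zeros, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def graph_laplacian(S):
    """Unnormalized Laplacian L = D - S of a symmetric similarity matrix S."""
    S = (S + S.T) / 2.0  # enforce symmetry
    return np.diag(S.sum(axis=1)) - S


def complete_dual_laplacian(A, S_mirna, S_disease, alpha=0.5, beta=0.5,
                            mask=None, n_iter=500):
    """Hypothetical helper (not the published DLRMC solver).

    Minimizes, by gradient descent over X (miRNAs x diseases):
        0.5 * ||mask * (X - A)||_F^2
        + 0.5 * alpha * tr(X^T L_m X)   # similar miRNAs get similar rows
        + 0.5 * beta  * tr(X L_d X^T)   # similar diseases get similar columns
    `mask` is a 0/1 matrix of entries used in the data term; by default every
    entry is used, i.e. unknown associations are treated as zeros (assumption).
    """
    A = np.asarray(A, dtype=float)
    mask = np.ones_like(A) if mask is None else np.asarray(mask, dtype=float)
    L_m = graph_laplacian(np.asarray(S_mirna, dtype=float))
    L_d = graph_laplacian(np.asarray(S_disease, dtype=float))
    # Upper bound on the gradient's Lipschitz constant; 1/lip is a safe step.
    lip = 1.0 + alpha * np.linalg.eigvalsh(L_m)[-1] \
              + beta * np.linalg.eigvalsh(L_d)[-1]
    X = np.zeros_like(A)
    for _ in range(n_iter):
        grad = mask * (X - A) + alpha * (L_m @ X) + beta * (X @ L_d)
        X -= grad / lip
    return X


# Toy data (made up): two groups of miRNAs and two groups of diseases.
S = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
A = np.array([[1, 0, 0, 0],   # the (0, 1) association is hidden (set to 0)
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)

scores = complete_dual_laplacian(A, S, S)
# Rank candidate diseases for miRNA 0 among its unknown entries.
unknown = np.where(A[0] == 0)[0]
ranking = unknown[np.argsort(-scores[0, unknown])]
print(ranking)
```

In this toy setting the hidden entry receives a higher score than entries linking miRNA 0 to the unrelated disease group, because both Laplacian terms pull it toward its similar neighbours. A leave-one-out evaluation in the spirit of the abstract would repeat this hide-and-rank step for each known association.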