State Key Laboratory of Plant Genomics, Institute of Genetic and Developmental Biology, Chinese Academy of Sciences, No. 1 West Beichen Road, Chaoyang District, Beijing, 100101, China.
Key Lab of Agricultural Biotechnology of Ningxia, Agricultural Biotechnology Center, Ningxia Academy of Agriculture and Forestry Sciences, 590 Huanghe East Road, Jinfeng District, Yinchuan, Ningxia, 750002, China.
Sci Rep. 2017 Mar 20;7:43792. doi: 10.1038/srep43792.
Identifying associations between microRNA molecules and human diseases from large-scale heterogeneous biological data is an important step toward understanding disease pathogenesis at the microRNA level. However, experimental verification of microRNA-disease associations is expensive and time-consuming. To overcome these drawbacks of conventional experimental methods, we present a combinatorial prioritization algorithm to predict microRNA-disease associations. Importantly, our method can predict candidate microRNAs for diseases that have no known associated microRNAs, and candidate diseases for microRNAs that have no known associated diseases. The predictive performance of the proposed approach was evaluated by internal cross-validation and external independent validation on standard association datasets. The results show that the proposed method predicts microRNA-disease associations with an area under the receiver operating characteristic curve (AUC) of 86.93%, outperforming previous prediction methods. In particular, we observed that an ensemble-based method that integrates the predictions of multiple algorithms gives more reliable and robust predictions than any single algorithm, improving the AUC to 92.26%. We applied our combinatorial prioritization algorithm to lung neoplasms and breast neoplasms and identified the top 30 microRNA candidates for each, which are consistent with the published literature and databases.
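To make the evaluation terms concrete, the following is a minimal sketch of how an AUC is computed and how the outputs of several predictors might be combined into an ensemble score. It is not the algorithm from the paper: the abstract does not say how the individual predictions are integrated, so rank averaging is used here purely as an assumption. The function names (`auc`, `to_ranks`, `rank_average`) and the toy scores are hypothetical.

```python
from statistics import mean


def auc(scores, labels):
    """AUC as the probability that a random positive scores higher than a
    random negative (Mann-Whitney formulation); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def to_ranks(scores):
    """Replace each score by its rank (1 = lowest) so that predictors with
    different score scales become comparable. Tied scores receive
    consecutive ranks in input order (a simplification)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def rank_average(score_lists):
    """Assumed ensemble rule: average the per-predictor ranks of each
    candidate microRNA-disease pair."""
    ranked = [to_ranks(s) for s in score_lists]
    return [mean(col) for col in zip(*ranked)]


# Toy data (made up): 1 = known association, 0 = no known association.
labels = [1, 1, 1, 0, 0, 0]
method_a = [0.9, 0.8, 0.2, 0.7, 0.1, 0.3]  # hypothetical predictor A
method_b = [5, 1, 8, 2, 6, 0]              # hypothetical predictor B

combined = rank_average([method_a, method_b])
print("AUC A:", auc(method_a, labels))
print("AUC B:", auc(method_b, labels))
print("AUC ensemble:", auc(combined, labels))
```

On this toy input the rank-averaged score separates the known associations better than either predictor alone, which illustrates, but does not reproduce, the kind of improvement reported for the ensemble.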