Division of Biomedical Engineering at the University of Saskatchewan.
School of Computer Science at Shaanxi Normal University.
Brief Funct Genomics. 2021 Jul 17;20(4):273-287. doi: 10.1093/bfgp/elab002.
Biomolecules such as microRNAs, circRNAs, lncRNAs and genes are functionally interdependent in human cells, and all play critical roles in diverse fundamental biological processes. Dysregulation of these biomolecules can cause disease. Identifying associations between biomolecules and diseases can uncover the mechanisms of complex diseases, which aids their diagnosis, treatment, prognosis and prevention. Because biological experiments are time-consuming and costly, many computational methods for association prediction have been proposed in the past few years. In this study, we provide a comprehensive review of machine learning-based approaches that predict disease-biomolecule associations from multi-view data sources. First, we introduce relevant databases and general strategies for integrating multi-view data sources into prediction models. Second, we discuss several feature representation methods for machine learning-based prediction models. Third, we review machine learning-based prediction approaches in three categories: basic machine learning methods, matrix completion-based methods and deep learning-based methods, and discuss their advantages and disadvantages. Finally, we offer some perspectives on further improving biomolecule-disease association prediction methods.