KU Leuven, Campus KULAK, Faculty of Medicine, Kortrijk, Belgium.
ITEC, imec research group at KU Leuven, Kortrijk, Belgium.
BMC Bioinformatics. 2020 Feb 7;21(1):49. doi: 10.1186/s12859-020-3379-z.
Computational prediction of drug-target interactions (DTI) is vital for drug discovery, because the experimental identification of interactions between drugs and target proteins is very onerous. Modern technologies have mitigated the problem and facilitated the development of new drugs, but drug development remains extremely expensive and time-consuming. In silico DTI prediction based on machine learning can therefore alleviate the burden of drug development. Many machine learning approaches have been proposed for DTI prediction over the years; nevertheless, prediction accuracy and efficiency remain open problems. Here, we propose a new learning method that addresses DTI prediction as a multi-output prediction task by learning ensembles of multi-output bi-clustering trees (eBICT) on reconstructed networks. In our setting, the nodes of a DTI network (drugs and proteins) are represented by features (background information). The interactions between the nodes are modeled as an interaction matrix, which constitutes the output space of our problem. The proposed approach integrates background information from both the drug and the target protein spaces into the same global network framework.
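The setting described above (feature vectors for drugs and proteins, an interaction matrix as the output space) can be illustrated with a minimal sketch. It flattens a toy network into one sample per (drug, target) pair, which is one common way to place both feature spaces in a single global representation; it is not the eBICT algorithm itself. The function name `build_global_pairs` and all data values are hypothetical and chosen only for illustration.

```python
from itertools import product


def build_global_pairs(drug_features, target_features, interactions):
    """Flatten a DTI network into one sample per (drug, target) pair.

    Hypothetical helper for illustration only. Each sample concatenates the
    drug's feature vector with the target's feature vector; the label is the
    corresponding entry of the interaction matrix.
    """
    n_drugs, n_targets = len(drug_features), len(target_features)
    if len(interactions) != n_drugs or any(
        len(row) != n_targets for row in interactions
    ):
        raise ValueError("interaction matrix must be n_drugs x n_targets")

    X, y = [], []
    # Row-major order: all targets for drug 0, then all targets for drug 1, ...
    for i, j in product(range(n_drugs), range(n_targets)):
        X.append(list(drug_features[i]) + list(target_features[j]))
        y.append(interactions[i][j])
    return X, y


# Toy, made-up data: 3 drugs with 2 features, 2 targets with 3 features.
drug_features = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
target_features = [[1.0, 0.0, 0.3], [0.0, 1.0, 0.7]]
interactions = [[1, 0], [0, 1], [0, 0]]  # rows: drugs, columns: targets

X, y = build_global_pairs(drug_features, target_features, interactions)
print(len(X), "pair samples with", len(X[0]), "features each")
```

Any learner trained on `X` and `y` can then split on drug features and on target features alike, which is the property a bi-clustering tree relies on when it partitions the rows and columns of the interaction matrix.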
We performed an empirical evaluation on several benchmark datasets that represent drug-protein networks, comparing the proposed approach to state-of-the-art DTI prediction methods, and demonstrated its effectiveness in different prediction settings. We show that output space reconstruction can boost the predictive performance of tree-ensemble learning methods, yielding more accurate DTI predictions.
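As a rough illustration of what output space reconstruction means, the sketch below replaces a binary interaction matrix with a real-valued low-rank approximation that could then serve as the training target. The abstract does not say which reconstruction method is used, so truncated SVD is only an assumed stand-in here, and the function name `low_rank_reconstruct` and the toy matrix are hypothetical.

```python
import numpy as np


def low_rank_reconstruct(Y, rank):
    """Return a rank-limited approximation of the interaction matrix Y.

    Assumption: truncated SVD stands in for whatever reconstruction method
    is actually used; it is chosen here only because it is simple.
    """
    Y = np.asarray(Y, dtype=float)
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    k = min(rank, s.size)  # cannot keep more components than exist
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Toy, made-up interaction matrix: 4 drugs (rows) x 3 targets (columns).
Y = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
])

Y_hat = low_rank_reconstruct(Y, rank=2)
print(np.round(Y_hat, 2))
```

Entries that were 0 in `Y` can take non-zero values in `Y_hat`, so unobserved pairs that resemble known interactions receive a graded score instead of a hard negative label.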
We proposed a new DTI prediction method in which bi-clustering trees are built on reconstructed networks. Building tree-ensemble learning models with output space reconstruction leads to superior prediction results while preserving the advantages of tree ensembles, such as scalability, interpretability and applicability in the inductive setting.