Center for Computational Biology, Mines ParisTech, PSL Research University, Paris, France.
Institut Curie F-75248, Paris, France.
PLoS One. 2018 Oct 4;13(10):e0204999. doi: 10.1371/journal.pone.0204999. eCollection 2018.
Adverse drug reactions, also called side effects, range from mild to fatal clinical events and significantly affect the quality of care. Among other causes, side effects occur when drugs bind to proteins other than their intended target. As experimentally testing drug specificity against the entire proteome is out of reach, we investigate the application of chemogenomics approaches. We formulate the study of drug specificity as a problem of predicting interactions between drugs and proteins at the proteome scale. We build several benchmark datasets and propose NN-MT, a multi-task Support Vector Machine (SVM) algorithm that is trained on a limited number of data points in order to solve the computational issues of proteome-wide SVM for chemogenomics. We compare NN-MT to different state-of-the-art methods and show that its prediction performance is similar or better, at a low computational cost. Compared to its competitors, the proposed method is particularly effective at predicting (protein, ligand) interactions in the difficult double-orphan case, i.e. when no interactions are previously known for either the protein or the ligand. The NN-MT algorithm appears to be a good default method, providing state-of-the-art or better performance in the wide range of prediction scenarios considered in the present study: proteome-wide prediction, protein family prediction, test (protein, ligand) pairs dissimilar to pairs in the train set, and orphan cases.
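To make the idea of training on a limited number of data points more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a (protein, ligand) pair is compared to another pair by multiplying a protein similarity by a ligand similarity (a Kronecker-style pairwise kernel, a common choice in chemogenomics), and that the training set is restricted to the known pairs most similar to the query pair. The abstract does not specify the kernel or the selection rule, so both are assumptions, as are all names and similarity values below. The selected subset would then be passed to an SVM, which is omitted here.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (protein id, ligand id)

# Hypothetical toy similarities in [0, 1]. Real values would come from
# protein sequence kernels and molecular structure kernels.
PROTEIN_SIM: Dict[Tuple[str, str], float] = {
    ("P1", "P2"): 0.8,
    ("P1", "P3"): 0.1,
    ("P2", "P3"): 0.2,
}
LIGAND_SIM: Dict[Tuple[str, str], float] = {
    ("L1", "L2"): 0.7,
    ("L1", "L3"): 0.3,
    ("L2", "L3"): 0.4,
}


def sim(table: Dict[Tuple[str, str], float], a: str, b: str) -> float:
    """Symmetric lookup; an item is fully similar to itself."""
    if a == b:
        return 1.0
    return table.get((a, b), table.get((b, a), 0.0))


def pair_kernel(x: Pair, y: Pair) -> float:
    """Assumed pairwise kernel: protein similarity times ligand similarity."""
    return sim(PROTEIN_SIM, x[0], y[0]) * sim(LIGAND_SIM, x[1], y[1])


def nearest_training_pairs(query: Pair, known: List[Pair], k: int) -> List[Pair]:
    """Keep only the k known pairs most similar to the query pair.

    This selection rule is an illustrative assumption, not the rule
    used by NN-MT.
    """
    ranked = sorted(known, key=lambda p: pair_kernel(query, p), reverse=True)
    return ranked[:k]


known_pairs: List[Pair] = [("P1", "L1"), ("P2", "L2"), ("P3", "L3"), ("P2", "L1")]
query: Pair = ("P1", "L2")
subset = nearest_training_pairs(query, known_pairs, k=2)
print(subset)
```

Restricting training to a small neighbourhood of the query keeps the kernel matrix small, which is the kind of saving that matters when the number of candidate (protein, ligand) pairs grows to proteome scale.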