Cortés-Ciriano Isidro, Bender Andreas, Malliavin Thérèse
Institut Pasteur, Unité de Bioinformatique Structurale, CNRS UMR 3825, Département de Biologie Structurale et Chimie, 25 rue du Dr Roux, 75015 Paris, France.
Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK.
Mol Inform. 2015 Jun;34(6-7):357-66. doi: 10.1002/minf.201400165. Epub 2015 Mar 20.
Poly(ADP-ribose) polymerases (PARPs) play a key role in DNA damage repair. PARP inhibitors act as chemo- and radiosensitizers and thus potentiate the cytotoxicity of DNA-damaging agents. Although PARP inhibitors are currently being investigated as chemotherapeutic agents, their cross-reactivity with other members of the PARP family remains unclear. Here, we apply proteochemometric modelling (PCM) to model the activity of 181 compounds on 12 human PARPs. We demonstrate that PCM (R₀² on the test set = 0.65-0.69; RMSE on the test set = 0.95-1.01 °C) performs better on the test set (interpolation) than Family QSAR and Family QSAM (Tukey's HSD, α = 0.05), and outperforms inductive transfer of knowledge among targets (Tukey's HSD, α = 0.05). We benchmark the predictive signal of 8 amino acid and 11 full-protein sequence descriptors, finding that all of them (except SOCN) show statistically indistinguishable performance (Tukey's HSD, α = 0.05). The extrapolation power of PCM to new compounds (RMSE = 1.02 ± 0.80 °C) and new targets (RMSE = 1.03 ± 0.50 °C) is comparable to interpolation, although the extrapolation ability is not uniform across the chemical and the target space. For this reason, we also provide confidence intervals calculated with conformal prediction. In addition, we present the R package conformal, which permits the calculation of confidence intervals for regression and classification caret models.
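The abstract states that confidence intervals are calculated with conformal prediction but does not describe the procedure. The following is a minimal Python sketch of one common variant, split (inductive) conformal regression, using plain absolute residuals as the nonconformity score. It is not the API of the R package conformal and may differ from the nonconformity measure used in the paper; the function names `conformal_halfwidth` and `prediction_interval` are hypothetical and chosen only for illustration.

```python
import math


def conformal_halfwidth(calibration_residuals, confidence=0.8):
    """Half-width of a split-conformal interval (illustrative sketch).

    calibration_residuals: absolute errors |y - y_hat| measured on a
    calibration set that was NOT used to train the model.
    confidence: desired coverage level, e.g. 0.8 for 80 % intervals.
    """
    n = len(calibration_residuals)
    if n == 0:
        raise ValueError("need at least one calibration residual")
    scores = sorted(calibration_residuals)
    # Finite-sample corrected rank of the residual used as half-width.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration points to guarantee this confidence level.
        return math.inf
    return scores[rank - 1]


def prediction_interval(y_hat, halfwidth):
    """Symmetric interval around a point prediction."""
    return (y_hat - halfwidth, y_hat + halfwidth)


# Toy calibration residuals (assumed values, in the same unit as the
# predicted activity, e.g. degrees Celsius).
residuals = [i / 10 for i in range(1, 11)]  # 0.1, 0.2, ..., 1.0
hw = conformal_halfwidth(residuals, confidence=0.8)
low, high = prediction_interval(5.0, hw)
```

Under the usual exchangeability assumption, intervals built this way contain the true value with at least the requested frequency on average. Because the half-width here is constant, the sketch does not reflect that extrapolation ability varies across chemical and target space; a normalized nonconformity score would be needed to make interval widths depend on the individual compound-target pair.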