Novartis Biomedical Research, Novartis Campus, 4002, Basel, Switzerland.
Nat Commun. 2024 Jul 9;15(1):5764. doi: 10.1038/s41467-024-49979-3.
Machine learning (ML) systems can model quantitative structure-property relationships (QSPR) using existing experimental data and make property predictions for new molecules. With the advent of modalities such as targeted protein degraders (TPD), the applicability of QSPR models is questioned and ML usage in TPD-centric projects remains limited. Herein, ML models are developed and evaluated for TPDs' property predictions, including passive permeability, metabolic clearance, cytochrome P450 inhibition, plasma protein binding, and lipophilicity. Interestingly, performance on TPDs is comparable to that of other modalities. Predictions for glues and heterobifunctionals often yield lower and higher errors, respectively. For permeability, CYP3A4 inhibition, and human and rat microsomal clearance, misclassification errors into high and low risk categories are lower than 4% for glues and 15% for heterobifunctionals. For all modalities, misclassification errors range from 0.8% to 8.1%. Investigated transfer learning strategies improve predictions for heterobifunctionals. This is the first comprehensive evaluation of ML for the prediction of absorption, distribution, metabolism, and excretion (ADME) and physicochemical properties of TPD molecules, including heterobifunctional and molecular glue sub-modalities. Taken together, our investigations show that ML-based QSPR models are applicable to TPDs and support ML usage for TPDs' design, to potentially accelerate drug discovery.
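The misclassification figures quoted above can be illustrated with a minimal sketch. It assumes that each measured and predicted value is binned into low, medium, or high using two cut-offs, and that a misclassification is a compound whose measured and predicted bins fall at opposite extremes. That definition, the function names, and the cut-off values are assumptions made for illustration; the abstract does not state them. Whether the high-value or the low-value bin counts as high risk depends on the property.

```python
from typing import Sequence


def value_bin(value: float, low_cut: float, high_cut: float) -> str:
    """Bin a value as 'low', 'medium', or 'high' using two cut-offs (hypothetical)."""
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "medium"


def high_low_misclassification_rate(
    measured: Sequence[float],
    predicted: Sequence[float],
    low_cut: float,
    high_cut: float,
) -> float:
    """Fraction of compounds whose measured and predicted bins are opposite extremes.

    Assumed definition: only low<->high flips count; low<->medium and
    medium<->high disagreements are not counted.
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if len(measured) == 0:
        raise ValueError("at least one compound is required")

    flips = 0
    for m, p in zip(measured, predicted):
        bins = {value_bin(m, low_cut, high_cut), value_bin(p, low_cut, high_cut)}
        if bins == {"low", "high"}:
            flips += 1
    return flips / len(measured)


# Toy values, not data from the study.
measured = [5.0, 50.0, 120.0, 8.0]
predicted = [7.0, 110.0, 9.0, 6.0]
rate = high_low_misclassification_rate(measured, predicted, low_cut=10.0, high_cut=100.0)
print(f"high/low misclassification rate: {rate:.2%}")
```

In this toy example only the third compound flips between the extreme bins (measured high, predicted low), so one of four compounds is counted. The second compound moves from medium to high and is not counted under the assumed definition.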