Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China.
J Chem Inf Model. 2022 Jun 13;62(11):2788-2799. doi: 10.1021/acs.jcim.2c00297. Epub 2022 May 24.
The prediction and optimization of pharmacokinetic properties are essential in lead optimization. Traditional strategies depend mainly on empirical chemical rules from medicinal chemists. However, as the amount of data grows, it is becoming more difficult to extract useful medicinal chemistry knowledge manually. To this end, we introduce IDL-PPBopt, a computational strategy for predicting and optimizing the plasma protein binding (PPB) property based on an interpretable deep learning method. First, a curated PPB data set was used to construct an interpretable deep learning model, which showed excellent predictive performance, with a root mean squared error of 0.112 on the entire test set. Then, we designed a detection protocol based on the model and the Wilcoxon test to identify the PPB-related substructures (named privileged substructures, PSubs) for each molecule. In total, 22 general privileged substructures (GPSubs) were identified, which share some common features such as nitrogen-containing groups, diamines with two carbon units, and azetidine. Furthermore, a series of second-level chemical rules for each GPSub was derived through a statistical test and then summarized into substructure pairs. We demonstrated that these substructure pairs are equally applicable outside the training set and accordingly customized structural modification schemes for each GPSub, which provide alternatives for optimizing the PPB property. Therefore, IDL-PPBopt provides a promising scheme for the prediction and optimization of the PPB property and would be helpful for lead optimization of other pharmacokinetic properties.
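The two quantitative ingredients named in the abstract, the root mean squared error and a Wilcoxon-type test, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which Wilcoxon variant was used or what quantity was tested, so the one-sided rank-sum test below, comparing model outputs for molecules with and without a candidate substructure, is an assumption. The function names (`rmse`, `average_ranks`, `rank_sum_test`) and all numbers are hypothetical and invented purely for illustration.

```python
import math
from statistics import NormalDist


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(with_sub, without_sub):
    """One-sided Wilcoxon rank-sum test using the normal approximation.

    Alternative hypothesis: values in `with_sub` tend to be larger than
    values in `without_sub`. No tie or continuity correction is applied,
    so the p-value is only approximate for small or heavily tied samples.
    """
    n1, n2 = len(with_sub), len(without_sub)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    ranks = average_ranks(list(with_sub) + list(without_sub))
    w = sum(ranks[:n1])                      # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2          # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


if __name__ == "__main__":
    # Toy numbers only: hypothetical bound fractions, not data from the paper.
    observed = [0.95, 0.80, 0.60, 0.99]
    predicted = [0.90, 0.85, 0.70, 0.97]
    print("RMSE:", round(rmse(observed, predicted), 3))

    # Hypothetical model outputs for molecules with / without a substructure.
    with_sub = [0.97, 0.93, 0.99, 0.95, 0.91]
    without_sub = [0.60, 0.72, 0.88, 0.55, 0.81]
    z, p = rank_sum_test(with_sub, without_sub)
    print("z =", round(z, 3), "p =", round(p, 4))
```

In this sketch a small p-value would flag the candidate substructure as associated with higher predicted binding. A real protocol would also need a correction for testing many substructures, which the abstract does not describe.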