Dipartimento di Scienze Farmaceutiche, Università degli Studi di Milano, Via Luigi Mangiagalli, 25, I-20133 Milano, Italy.
EXSCALATE, Dompé Farmaceutici S.p.A., Via Tommaso De Amicis, 95, I-80131 Napoli, Italy.
Int J Mol Sci. 2023 Jul 4;24(13):11064. doi: 10.3390/ijms241311064.
The prediction of drug metabolism is attracting great interest because it allows molecules with an unfavorable ADME/Tox profile to be discarded at an early stage of the drug discovery process. In this context, artificial intelligence methods can generate high-performing predictive models, provided they are trained on accurate metabolic data. MetaQSAR-based datasets were collected to predict the sites of metabolism for most metabolic reactions. The models were based on a set of structural, physicochemical, and stereo-electronic descriptors and were generated with the random forest algorithm. For each biotransformation considered, two types of models were developed: the first type involved all non-reactive atoms and included atom types among the descriptors, while the second type involved only non-reactive centers having the same atom type(s) as the reactive atoms. All models of the first type achieved very high performance; the models of the second type performed worse on average but were almost always able to recognize the reactive centers; only conjugations with glucuronic acid were unsatisfactorily predicted by the models of the second type. Feature evaluation confirmed the major role of lipophilicity, self-polarizability, and H-bonding for almost all the reactions considered. The results emphasize the possibility of recognizing the sites of metabolism with classification models trained on the MetaQSAR database. The two types of models can be combined synergistically, since the first models identify which atoms can undergo a given metabolic reaction, while the second models detect the truly reactive centers. The generated models are available as scripts for the VEGA program.
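The difference between the two model types lies mainly in how the training set is assembled. The following is a minimal sketch of that selection step, not the authors' implementation: the `Atom` class, the atom-type labels, the descriptor values, and both function names are hypothetical placeholders, and dropping the atom type from the second dataset is an assumption based on every retained atom already sharing a type with a reactive atom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    atom_type: str      # hypothetical atom-type label, e.g. "C.3"
    descriptors: tuple  # placeholder descriptor values, not real data
    reactive: bool      # True if annotated as a site of metabolism


def build_type1_dataset(atoms):
    """First model type: all atoms, with atom type kept as a descriptor."""
    X = [(a.atom_type, *a.descriptors) for a in atoms]
    y = [int(a.reactive) for a in atoms]
    return X, y


def build_type2_dataset(atoms):
    """Second model type: only atoms whose type matches a reactive atom's type.

    Assumption: atom type is omitted from the features here.
    """
    reactive_types = {a.atom_type for a in atoms if a.reactive}
    kept = [a for a in atoms if a.atom_type in reactive_types]
    X = [a.descriptors for a in kept]
    y = [int(a.reactive) for a in kept]
    return X, y


# Toy atoms for one hypothetical biotransformation (values are made up).
atoms = [
    Atom("C.3", (0.8, 0.1), True),
    Atom("C.3", (0.5, 0.2), False),
    Atom("O.3", (-0.3, 0.9), False),
    Atom("N.ar", (-0.1, 0.4), False),
]

X1, y1 = build_type1_dataset(atoms)
X2, y2 = build_type2_dataset(atoms)
print(len(X1), len(X2))  # 4 atoms in the first dataset, 2 in the second
```

In this sketch the second dataset is the harder classification task, because the negative examples resemble the reactive atoms more closely. That is in line with the lower average performance reported for the second model type.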