Mazzolari Angelica, Afzal Avid M, Pedretti Alessandro, Testa Bernard, Vistoli Giulio, Bender Andreas
Dipartimento di Scienze Farmaceutiche, Facoltà di Scienze del Farmaco, Università degli Studi di Milano, Via Mangiagalli, I-20133 Milano, Italy.
Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW Cambridge, U.K.
ACS Med Chem Lett. 2019 Feb 12;10(4):633-638. doi: 10.1021/acsmedchemlett.8b00603. eCollection 2019 Apr 11.
Even though glucuronidations are the most frequent conjugation reactions in metabolism, both in quantitative and qualitative terms, they have seldom been investigated using computational approaches. To fill this gap, we used the manually curated MetaQSAR metabolic reaction database to generate two models for the prediction of UGT-mediated metabolism, both based on molecular descriptors and implementing the Random Forest algorithm. The first model predicts whether the reaction occurs. It was internally validated with a Matthews correlation coefficient (MCC) of 0.76 and an area under the ROC curve (AUC) of 0.94, and externally validated on a test set of 120 additional xenobiotics (MCC of 0.70 and AUC of 0.90). The second model distinguishes between O- and N-glucuronidations. It was optimized by random undersampling to improve predictive accuracy during internal validation, with the recall of the minority class increasing from 0.55 to 0.78.
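The evaluation measures and the resampling step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels "O" and "N", and the 8:2 class ratio are all assumptions made for illustration, and the Random Forest training itself is left out. The sketch only shows how MCC and recall follow from confusion-matrix counts, and how random undersampling trims every class to the size of the smallest one before a model is trained.

```python
import math
import random


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def recall(tp, fn):
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class.

    Hypothetical helper for illustration; the seed makes it repeatable.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for sample in rng.sample(items, n_min):
            balanced.append((sample, label))
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [y for _, y in balanced]


if __name__ == "__main__":
    # Toy imbalanced set: 8 "O" (majority) vs 2 "N" (minority) examples.
    samples = list(range(10))
    labels = ["O"] * 8 + ["N"] * 2
    balanced_samples, balanced_labels = random_undersample(samples, labels)
    print(balanced_labels.count("O"), balanced_labels.count("N"))

    # Toy predictions scored with MCC and recall of the positive class.
    tp, fp, tn, fn = confusion_counts([1, 1, 0, 0], [1, 0, 0, 1])
    print(mcc(tp, fp, tn, fn), recall(tp, fn))
```

Undersampling discards majority-class examples, so it typically trades some overall accuracy for better recall on the minority class, which is the effect the abstract reports for the second model.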