Zheng Mingyue, Luo Xiaomin, Shen Qiancheng, Wang Yong, Du Yun, Zhu Weiliang, Jiang Hualiang
Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China.
Bioinformatics. 2009 May 15;25(10):1251-8. doi: 10.1093/bioinformatics/btp140. Epub 2009 Mar 13.
One goal of metabolomics is to define and monitor the entire metabolite complement of a cell, but this goal remains far out of reach because systematic and rapid approaches for determining the biotransformations of newly discovered metabolites are lacking. In drug development, the metabolic biotransformation of a new chemical entity (NCE) is of particular interest because it may profoundly affect the compound's bioavailability, activity and toxicity profile. In silico prediction of the site of metabolism (SOM) in phase I cytochrome P450-mediated reactions is usually a starting point for metabolic pathway studies and may also assist drug/lead optimization.
This article reports cytochrome P450 (CYP450)-mediated SOM prediction for the six most important metabolic reactions, combining machine learning with semi-empirical quantum chemical calculations. Non-local models were developed from a large dataset comprising 1858 metabolic reactions extracted from 1034 heterogeneous chemicals. In validation, the overall accuracy for each of the six reaction types was higher than 0.81, and for four of them it exceeded 0.90. In further receiver operating characteristic (ROC) analyses, each SOM model gave an area under the curve (AUC) above 0.86, indicating good predictive power. In an external test on a previously published dataset, 80% of the experimentally observed SOMs were correctly identified by applying the full set of SOM models.
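For readers unfamiliar with the two validation metrics quoted above, the following minimal Python sketch shows how overall accuracy and ROC AUC can be computed for a per-atom SOM classifier. It is illustrative only: the function names, the 0.5 decision threshold, and the labels and scores are hypothetical, and nothing here is taken from the SOME_v1.0 package or the authors' dataset.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of candidate sites classified correctly at a fixed threshold.

    The 0.5 threshold is an assumption made for this example.
    """
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(predicted, labels) if p == y)
    return correct / len(labels)


def roc_auc(labels, scores):
    """ROC AUC via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen true SOM receives a
    higher score than a randomly chosen non-SOM; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative site")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = experimentally observed SOM, 0 = not a SOM.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical model scores for the same candidate atoms.
scores = [0.92, 0.15, 0.71, 0.64, 0.08, 0.38, 0.22, 0.45]

print("accuracy:", accuracy(labels, scores))
print("AUC:", roc_auc(labels, scores))
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason both figures are commonly reported together.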
The program package SOME_v1.0 (Site Of Metabolism Estimator), developed from our models, is available at http://www.dddc.ac.cn/adme/myzheng/SOME_1_0.tar.gz.