Sun Hongmao, Veith Henrike, Xia Menghang, Austin Christopher P, Tice Raymond R, Huang Ruili
National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA.
Mol Inform. 2012 Nov 1;31(11-12):783-792. doi: 10.1002/minf.201200065. Epub 2012 Oct 11.
The human cytochrome P450 (CYP) enzyme family is involved in the biotransformation of many xenobiotics. As part of the U.S. Tox21 Phase I effort, we profiled the CYP activity of approximately three thousand compounds, primarily those of environmental concern, against the human CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4 isoforms in a quantitative high-throughput screening (qHTS) format. To evaluate how accurately computational models built from a drug-like library, screened in the same five CYP assays under the same conditions, can predict the outcome for an environmental compound library, five support vector machine (SVM) models built from over 17,000 drug-like compounds were challenged to predict the CYP activities of the Tox21 compound collection. Although a large fraction of the test compounds fall outside the applicability domain (AD) of the models, as measured by k-nearest neighbor (k-NN) similarities, the predictions were largely accurate for the CYP1A2, CYP2C9, and CYP3A4 isozymes, with areas under the receiver operating characteristic curve (AUC-ROC) ranging between 0.82 and 0.84. The lower predictive power of the CYP2C19 model (AUC-ROC = 0.76) is caused by experimental errors, and that of the CYP2D6 model (AUC-ROC = 0.76) can be rescued by rebalancing the training data. Our results demonstrate that decomposing molecules into atom types enhanced the coverage of the AD and that computational models built from drug-like molecules can be used to predict the ability of non-drug-like compounds to interact with these CYPs.
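The two quantities the abstract relies on, a k-NN similarity used to judge whether a test compound lies inside the AD and the AUC-ROC used to score predictions, can be illustrated with a minimal sketch. This is not the authors' implementation: representing each molecule as a set of atom-type indices, using Tanimoto similarity, averaging over the k nearest training compounds, and the values of `k` and `threshold` are all assumptions made for illustration, and every function name is hypothetical.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of feature indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_similarity(query: Set[int], training: List[Set[int]], k: int = 3) -> float:
    """Mean similarity of the query to its k most similar training compounds.

    Averaging over the top k is an assumed definition, not taken from the paper.
    """
    sims = sorted((tanimoto(query, t) for t in training), reverse=True)
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def in_domain(query: Set[int], training: List[Set[int]],
              k: int = 3, threshold: float = 0.5) -> bool:
    """Treat a compound as inside the AD if its k-NN similarity reaches the threshold.

    The threshold value is hypothetical.
    """
    return knn_similarity(query, training, k) >= threshold


def auc_roc(labels: List[int], scores: List[float]) -> float:
    """AUC-ROC as the probability that a random active outscores a random inactive.

    Ties between an active and an inactive count as half a win.
    """
    actives = [s for y, s in zip(labels, scores) if y == 1]
    inactives = [s for y, s in zip(labels, scores) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))


if __name__ == "__main__":
    # Toy training set: each compound is a set of atom-type indices (made-up data).
    training_set = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
    print(in_domain({1, 2, 3}, training_set, k=2))   # similar to training compounds
    print(in_domain({9, 10}, training_set, k=2))     # shares no atom types
    print(auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise AUC computation is quadratic in the number of compounds, which is fine for a toy example; for libraries of thousands of compounds a rank-based implementation from an established library would be the more practical choice.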