Biomedical Informatics Training Program, Stanford University School of Medicine and ‡Departments of Bioengineering and Genetics, Stanford University, Stanford, California 94305, United States.
J Chem Inf Model. 2013 Oct 28;53(10):2765-73. doi: 10.1021/ci3005868. Epub 2013 Oct 2.
Despite recent advances in molecular medicine and rational drug design, many drugs still fail because toxic effects arise at the cellular and tissue level. To better understand these effects, cellular assays can generate high-throughput measurements of the gene expression changes induced by small molecules. However, our understanding of how the chemical features of small molecules influence gene expression is very limited. We therefore investigated the extent to which chemical features of small molecules can be reliably associated with significant changes in gene expression. Specifically, we analyzed the gene expression response of rat liver cells to 170 different drugs and searched for genes whose expression could be related to chemical features alone. Surprisingly, we can predict the up-regulation of 87 genes, where up-regulation means expression increased to at least 1.5 times that of controls. For each of these 87 genes, the average cross-validated area under the receiver operating characteristic curve (AUROC) is 0.7 or greater. We applied our method to an external data set of rat liver gene expression response to a novel drug and achieved an AUROC of 0.7. We also validated our approach by predicting up-regulation of Cytochrome P450 1A2 (CYP1A2) for three drugs that are known to induce CYP1A2 and were not in our data set. Finally, a detailed analysis of the CYP1A2 predictor allowed us to identify which fragments made significant contributions to the predictive scores.
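To make the two quantities in the abstract concrete, the following is a minimal sketch of how an up-regulation label (expression at least 1.5 times control) and an AUROC could be computed. It is an illustration under stated assumptions, not the authors' pipeline: the function names `is_upregulated` and `auroc`, the expression values, and the predictor scores are all hypothetical, and the AUROC is computed with the standard pairwise-ranking definition rather than any method stated in the source.

```python
from typing import Sequence

# From the abstract: up-regulation means expression of at least 1.5x control.
FOLD_CHANGE_THRESHOLD = 1.5


def is_upregulated(treated: float, control: float) -> bool:
    """Label a gene as up-regulated when treated expression is >= 1.5x control."""
    return treated >= FOLD_CHANGE_THRESHOLD * control


def auroc(labels: Sequence[bool], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count as 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical (treated, control) expression pairs for one gene across six drugs.
expression = [(3.0, 1.0), (1.2, 1.0), (1.5, 1.0), (0.9, 1.0), (2.0, 1.0), (1.0, 1.0)]
# Hypothetical predictor scores for the same six drugs (assumed, not real output).
scores = [0.9, 0.4, 0.35, 0.2, 0.8, 0.5]

labels = [is_upregulated(t, c) for t, c in expression]
print(auroc(labels, scores))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the 0.7 figure reported above indicates that the predictor ranks an up-regulating drug above a non-up-regulating one about 70% of the time.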