Eichner Johannes, Wrzodek Clemens, Römer Michael, Ellinger-Ziegelbauer Heidrun, Zell Andreas
Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Tübingen, Germany.
Global Early Development, Bayer Pharma AG, Wuppertal, Germany.
PLoS One. 2014 May 14;9(5):e97678. doi: 10.1371/journal.pone.0097678. eCollection 2014.
The current gold-standard method for cancer safety assessment of drugs is the rodent two-year bioassay, which is associated with significant costs and requires testing a large number of animals over their lifetime. Because no comprehensive set of short-term assays for predicting carcinogenicity exists, new approaches are currently being evaluated. One promising approach is toxicogenomics, which uses genome-wide molecular profiling after compound treatment to improve mechanistic understanding and may allow carcinogenic potential to be predicted through mathematical modeling. Such modeling typically involves extracting informative genes from omics datasets; these genes can then be used to construct generalizable models for the early classification of compounds with unknown carcinogenic potential. Here we formally describe and compare two novel methodologies for the reproducible extraction of characteristic mRNA signatures, which were employed to capture specific gene expression changes observed for nongenotoxic carcinogens. The first method integrates multiple gene rankings, generated by diverse algorithms applied to data from different subsamplings of the training compounds, whereas the second employs a statistical ratio to identify informative genes. Both methods were evaluated on a dataset obtained from the toxicogenomics database TG-GATEs to predict the outcome of a two-year bioassay based on profiles from 14-day treatments. Additionally, we applied our methods to datasets from previous studies and showed that the derived prediction models are on average more accurate than those built from the original signatures. The selected genes were mostly related to p53 signaling and to specific changes in anabolic processes or energy metabolism, which are typically observed in tumor cells. Among the genes most frequently incorporated into prediction models were Phlda3, Cdkn1a, Akr7a3, Ccng1 and Abcb4.
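The first methodology described in the abstract, integrating gene rankings produced by several algorithms on different subsamplings of the training compounds, can be illustrated with a minimal sketch. The abstract does not say which ranking algorithms or which aggregation rule the authors used, so everything below is an assumption made for illustration: the two scoring functions (`mean_difference`, `signal_to_noise`), the mean-rank aggregation, the stratified subsampling, and all function and parameter names are hypothetical and are not the published method.

```python
import random
from statistics import mean, pstdev


def mean_difference(pos, neg):
    """Assumed scorer 1: absolute difference of the class means for one gene."""
    return abs(mean(pos) - mean(neg))


def signal_to_noise(pos, neg):
    """Assumed scorer 2: class-mean difference scaled by the summed class spreads."""
    return abs(mean(pos) - mean(neg)) / (pstdev(pos) + pstdev(neg) + 1e-9)


def rank_genes(profiles, labels, scorer):
    """Map each gene index to its rank (0 = most informative) under one scorer."""
    n_genes = len(next(iter(profiles.values())))
    scores = []
    for g in range(n_genes):
        pos = [profiles[c][g] for c in profiles if labels[c] == 1]
        neg = [profiles[c][g] for c in profiles if labels[c] == 0]
        scores.append(scorer(pos, neg))
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return {g: rank for rank, g in enumerate(order)}


def aggregate_rankings(profiles, labels, scorers,
                       n_subsamples=20, fraction=0.8, seed=0):
    """Return gene indices ordered by mean rank over all subsamples and scorers.

    profiles: compound name -> list of expression values (one per gene)
    labels:   compound name -> 1 (carcinogen) or 0 (non-carcinogen)
    Each class needs at least two compounds.
    """
    rng = random.Random(seed)
    pos = sorted(c for c in profiles if labels[c] == 1)
    neg = sorted(c for c in profiles if labels[c] == 0)
    n_genes = len(next(iter(profiles.values())))
    rank_sums = [0] * n_genes
    n_rankings = 0
    for _ in range(n_subsamples):
        # Stratified subsample so that both classes are always represented.
        chosen = (rng.sample(pos, max(2, int(fraction * len(pos))))
                  + rng.sample(neg, max(2, int(fraction * len(neg)))))
        subset = {c: profiles[c] for c in chosen}
        for scorer in scorers:
            for g, rank in rank_genes(subset, labels, scorer).items():
                rank_sums[g] += rank
            n_rankings += 1
    mean_ranks = [s / n_rankings for s in rank_sums]
    return sorted(range(n_genes), key=lambda g: mean_ranks[g])
```

Calling `aggregate_rankings(profiles, labels, [mean_difference, signal_to_noise])` returns gene indices from most to least consistently high-ranked; the leading entries would form a candidate signature. Averaging ranks rather than raw scores is one simple way to combine scorers that live on different scales.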