Taylor Ronald C, Acquaah-Mensah George, Singhal Mudita, Malhotra Deepti, Biswal Shyam
Computational Biology and Bioinformatics Group, Pacific Northwest National Laboratory, U.S. Department of Energy, Richland, Washington, United States of America.
PLoS Comput Biol. 2008 Aug 29;4(8):e1000166. doi: 10.1371/journal.pcbi.1000166.
A variety of cardiovascular, neurological, and neoplastic conditions have been associated with oxidative stress, i.e., conditions under which levels of reactive oxygen species (ROS) are elevated over significant periods. Nuclear factor erythroid 2-related factor 2 (Nrf2) regulates the transcription of several gene products involved in the protective response to oxidative stress. The transcriptional regulatory and signaling relationships linking gene products involved in the response to oxidative stress are currently only partially resolved. Microarray data are RNA abundance measures that represent gene expression patterns. In some cases, these patterns can identify the molecular interactions of gene products; in effect, they can serve as proxies for protein-protein and protein-DNA interactions. Traditional techniques used for clustering coregulated genes on high-throughput gene arrays are rarely capable of distinguishing between direct transcriptional regulatory interactions and indirect ones. In this study, two newly developed information-theoretic algorithms that employ the concept of mutual information were used: the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNE) and Context Likelihood of Relatedness (CLR). These algorithms captured dependencies in the gene expression profiles of the mouse lung, allowing the regulatory effect of Nrf2 in response to oxidative stress to be determined more precisely. In addition, a characterization of promoter sequences of Nrf2 regulatory targets was conducted using a Support Vector Machine classification algorithm to corroborate ARACNE and CLR predictions. Inferred networks were analyzed, compared, and integrated using the Collective Analysis of Biological Interaction Networks (CABIN) plug-in of Cytoscape. Using the two network inference algorithms and one machine learning algorithm, a number of both previously known and novel targets of Nrf2 transcriptional activation were identified.
Genes predicted as novel Nrf2 targets include Atf1, Srxn1, Prnp, Sod2, Als2, Nfkbib, and Ppp1r15b. Furthermore, microarray and quantitative RT-PCR experiments following cigarette-smoke-induced oxidative stress in Nrf2(+/+) and Nrf2(-/-) mouse lung affirmed many of the predictions made. Several new potential feed-forward regulatory loops involving Nrf2, Nqo1, Srxn1, Prdx1, Als2, Atf1, Sod1, and Park7 were predicted. This work shows the promise of network inference algorithms operating on high-throughput gene expression data in identifying transcriptional regulatory and other signaling relationships implicated in mammalian disease.
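The abstract names two mutual-information-based inference methods without describing their mechanics. As a rough illustration only, and not the authors' pipeline, the following Python sketch shows the general ideas usually associated with them: a pairwise mutual information matrix, a CLR-style score that compares each value against the background distribution of both genes, and an ARACNE-style pruning step based on the data processing inequality. The function names, the equal-width histogram estimator, and the `bins`, `threshold`, and `tolerance` parameters are assumptions made for this example; the published tools use more careful estimators and statistical thresholds.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of mutual information, in nats.

    Assumption: equal-width binning, chosen here only for simplicity.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def mi_matrix(expr, bins=8):
    """expr: genes x samples array. Returns a symmetric MI matrix (zero diagonal)."""
    n = expr.shape[0]
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_information(expr[i], expr[j], bins)
    return mi


def clr_scores(mi):
    """CLR-style score: z-score each MI value against both genes' backgrounds."""
    n = mi.shape[0]
    off = ~np.eye(n, dtype=bool)
    mean = np.array([mi[i, off[i]].mean() for i in range(n)])
    std = np.array([mi[i, off[i]].std() for i in range(n)])
    std[std == 0] = 1.0                   # guard against constant rows
    z = np.maximum(0.0, (mi - mean[:, None]) / std[:, None])
    score = np.sqrt(z ** 2 + z.T ** 2)    # combine the two per-gene z-scores
    np.fill_diagonal(score, 0.0)
    return score


def dpi_prune(mi, threshold=0.0, tolerance=0.0):
    """ARACNE-style pruning: in every fully connected triplet, drop the weakest edge.

    Returns a boolean adjacency matrix of the edges that survive.
    """
    n = mi.shape[0]
    adj = mi > threshold
    np.fill_diagonal(adj, False)
    remove = np.zeros_like(adj)
    for i in range(n):
        for j in range(i + 1, n):
            if not adj[i, j]:
                continue
            for k in range(n):
                if k == i or k == j or not (adj[i, k] and adj[j, k]):
                    continue
                # Edge i-j is treated as indirect if it is weaker than both i-k and j-k.
                if mi[i, j] < min(mi[i, k], mi[j, k]) * (1.0 - tolerance):
                    remove[i, j] = remove[j, i] = True
                    break
    return adj & ~remove
```

On a synthetic chain in which gene A drives gene B and gene B drives gene C, the pruning step would be expected to keep the A-B and B-C edges and discard the indirect A-C edge, which is the kind of direct-versus-indirect distinction the abstract refers to.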