Division of Pathology, The Norwegian Radium Hospital, Rikshospitalet University Hospital, Montebello 0310 Oslo, Norway.
BMC Bioinformatics. 2010 Jan 19;11:36. doi: 10.1186/1471-2105-11-36.
The availability of various "omics" datasets makes it possible to study genome-wide genetic regulatory networks. However, a major challenge in using mathematical models to infer genetic regulation from microarray datasets is the lack of information on protein concentrations and activities. Most previous studies assumed that the mRNA level of a gene is consistent with the activity of its protein, although this is not always the case. A more sophisticated modelling framework, together with corresponding inference methods, is therefore needed to accurately estimate genetic regulation from "omics" datasets.
This work developed a novel approach, based on a nonlinear mathematical model, to infer genetic regulation from microarray gene expression data. Using the p53 network as a test system, we applied the nonlinear model to estimate the activity of the transcription factor (TF) p53 from the expression levels of its target genes, and to identify whether p53 activates or inhibits each target gene. The top 317 predicted putative p53 target genes were supported by DNA sequence analysis. A comparison between our predictions and other published predictions of p53 targets suggests that most putative p53 targets may share a common depleted or enriched sequence signal in their upstream non-coding regions.
The proposed quantitative model can be used not only to infer the regulatory relationships between a TF and its downstream genes, but also to estimate the protein activity of the TF from the expression levels of its target genes.