Lu Zhiheng K, Allen O Brian, Desmond Anthony F
Metastract Inc.
Stat Appl Genet Mol Biol. 2012 Dec 14;11(6). doi: 10.1515/1544-6115.1818.
Gene expression profiles from microarray time course experiments provide a unique opportunity to examine genome-wide signal processing and gene responses. A fundamental issue in microarray experiments is that the treatment condition can only be controlled at the cell level rather than at the gene level. The treatment condition does not affect all genes equally: some genes depend on other genes to detect external changes. The dependency between genes is not fully deterministic and may vary with the treatment condition. Thus the expression of each gene is potentially affected by two confounding effects: the treatment effect and the gene context effect arising from the regulatory interactions among genes. This gene context effect is hard to isolate, yet it cannot simply be ignored. On the contrary, the gene context information, which may differ between treatment conditions, is of primary biological interest. We introduce an approach that deals with the confounding effects and takes into account the uncontrollable gene context effect. Our method is based on estimating the number of hidden states, which, in our development, corresponds to the order of a hidden Markov model (HMM). For each gene, the observed expression at each time point is modeled by a gamma distribution determined by the corresponding hidden state. Genes showing evidence for more than one hidden state can be categorized as signalling genes or, in a wider sense, as response genes that are coordinated by a cell system in reaction to a specific external condition. These response genes can be used to compare treatment conditions and to investigate the gene context effect under different treatments. Microarray time course data are analyzed to demonstrate our method.
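The modeling idea in the abstract, a per-gene HMM whose hidden state selects a gamma distribution for the observed expression, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the estimation procedure or the order-selection criterion, so the sketch only evaluates the log-likelihood of a gamma-emission HMM with the forward algorithm for given parameters. The function names (`gamma_logpdf`, `hmm_loglik`), the expression series `expr`, and all parameter values are hypothetical and hand-picked, not estimated from data.

```python
import math

def gamma_logpdf(x, shape, scale):
    """Log density of a gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def _log(p):
    """log(p) that returns -inf for p == 0 instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf

def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def hmm_loglik(obs, init, trans, shapes, scales):
    """Forward-algorithm log-likelihood of a gamma-emission HMM.

    obs    : positive expression values over time
    init   : initial state probabilities (length k)
    trans  : k x k transition matrix, trans[i][j] = P(state j | state i)
    shapes, scales : gamma parameters of each hidden state
    """
    k = len(init)
    # Log forward variables at the first time point.
    log_alpha = [_log(init[j]) + gamma_logpdf(obs[0], shapes[j], scales[j])
                 for j in range(k)]
    # Recursion over the remaining time points.
    for x in obs[1:]:
        log_alpha = [
            _logsumexp([log_alpha[i] + _log(trans[i][j]) for i in range(k)])
            + gamma_logpdf(x, shapes[j], scales[j])
            for j in range(k)
        ]
    return _logsumexp(log_alpha)

# Hypothetical expression series for one gene: low, then high, then low again.
expr = [1.1, 0.9, 1.0, 4.8, 5.2, 5.0, 1.2, 0.8]

# One hidden state: a single gamma distribution for all time points (assumed values).
ll_one = hmm_loglik(expr, [1.0], [[1.0]], shapes=[2.0], scales=[1.25])

# Two hidden states: a "low" and a "high" expression state (assumed values).
ll_two = hmm_loglik(
    expr,
    init=[0.5, 0.5],
    trans=[[0.8, 0.2], [0.2, 0.8]],
    shapes=[20.0, 50.0],
    scales=[0.05, 0.1],
)

print("one-state log-likelihood:", ll_one)
print("two-state log-likelihood:", ll_two)
```

In practice the parameters would be estimated for each candidate number of states, and the fitted models would then be compared with an order-selection procedure. A larger model can always fit at least as well, so the raw log-likelihoods alone are not enough to decide whether a gene shows evidence for more than one hidden state.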