Pyne Saumyadipta, Gutman Roee, Kim Chang Sik, Futcher Bruce
Broad Institute of Massachusetts Institute of Technology and Harvard University, Cambridge, MA 02142, USA.
BMC Genomics. 2009 Sep 17;10:440. doi: 10.1186/1471-2164-10-440.
Many genes oscillate in their level of expression through the cell division cycle. Previous studies have identified such genes by applying Fourier analysis to cell cycle time course experiments. Typically, such analyses generate p-values: an oscillating gene has a small p-value, meaning that the observed oscillation is unlikely to be due to chance. When multiple time course experiments are integrated, p-values from the individual experiments are combined using classical meta-analysis techniques. However, this approach sacrifices information inherent in the individual experiments, because the hypothesis that a gene is regulated according to the time in the cell cycle makes two independent predictions: first, that an oscillation in expression will be observed; and second, that gene expression will always peak in the same phase of the cell cycle, such as S-phase. Approaches that simply combine p-values ignore the second prediction.
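As a rough illustration of the conventional pipeline described above, the following Python sketch extracts the amplitude and phase of a single Fourier component from one time course, and then combines per-experiment p-values. The function names, the choice of a single known cell cycle period, and the use of Fisher's method as the "classical meta-analysis technique" are all assumptions made for this example; the text above does not specify which combination rule earlier studies used.

```python
import math


def fourier_component(times, values, period):
    """Amplitude and phase of the sinusoid with the given period.

    Hypothetical helper: projects the mean-centered series onto cosine
    and sine at the assumed cell cycle frequency.
    """
    n = len(values)
    mean = sum(values) / n
    omega = 2.0 * math.pi / period
    a = sum((v - mean) * math.cos(omega * t) for t, v in zip(times, values))
    b = sum((v - mean) * math.sin(omega * t) for t, v in zip(times, values))
    amplitude = 2.0 * math.hypot(a, b) / n
    phase = math.atan2(b, a) % (2.0 * math.pi)  # angle of peak expression
    return amplitude, phase


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (an assumption here).

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom, whose survival function has a closed form
    for even degrees of freedom. Each p-value must lie in (0, 1].
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

Note that `fisher_combine` sees only the p-values: the phase returned by `fourier_component` is discarded at the combination step, which is the loss of information the paragraph above points out.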
Here, we improve the detection of cell cycle oscillating genes by systematically taking into account the phase of peak gene expression. We design a novel meta-analysis measure based on vector addition: when a gene peaks or troughs in the same phase of the cell cycle in all experiments, the representative vectors add to produce a large final vector. Conversely, when the peaks in different experiments fall in various phases of the cycle, vector addition produces a small final vector. We apply the measure to ten genome-wide cell cycle time course experiments from the fission yeast Schizosaccharomyces pombe, and detect many new, weakly oscillating genes.
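The vector-addition idea can be sketched in a few lines of Python. This is a minimal illustration, not the published measure: each experiment is assumed to contribute one vector whose angle is the phase of peak expression and whose length is some per-experiment weight (an oscillation amplitude is used here). The abstract does not say how the authors scale each vector or assess significance, so `phase_vector_sum` and its inputs are hypothetical.

```python
import math


def phase_vector_sum(amplitudes, phases):
    """Add one vector per experiment and return (length, angle) of the sum.

    amplitudes: assumed per-experiment weights (e.g. oscillation amplitude)
    phases:     peak phases in radians, 0 to 2*pi around the cell cycle
    """
    x = sum(a * math.cos(p) for a, p in zip(amplitudes, phases))
    y = sum(a * math.sin(p) for a, p in zip(amplitudes, phases))
    length = math.hypot(x, y)
    angle = math.atan2(y, x) % (2.0 * math.pi)
    return length, angle


# Ten experiments, same weak amplitude in each.
weak = [0.2] * 10

# Gene A: peaks at the same phase every time, so the vectors reinforce.
consistent, _ = phase_vector_sum(weak, [1.0] * 10)

# Gene B: peaks spread evenly around the cycle, so the vectors cancel.
scattered, _ = phase_vector_sum(weak, [2.0 * math.pi * k / 10 for k in range(10)])
```

In this toy case the two genes have identical amplitudes in every experiment, so a measure based on amplitude alone could not separate them. The summed vector for the phase-consistent gene grows with the number of experiments, while the sum for the phase-scattered gene stays near zero.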
A very large fraction of all genes in S. pombe, perhaps one-quarter to one-half, show some cell cycle oscillation, although in many cases these oscillations may be incidental rather than adaptive.