Kim Bong-Rae, Zhang Li, Berg Arthur, Fan Jianqing, Wu Rongling
Department of Operations Research and Financial Engineering, Princeton University, Princeton, New Jersey 08544, USA.
Genetics. 2008 Oct;180(2):821-34. doi: 10.1534/genetics.108.093690. Epub 2008 Sep 9.
DNA microarray analysis has emerged as a leading technology for understanding gene regulation and function in cellular mechanisms on a genomic scale. By collecting massive gene-expression data over a time course, this technology can now be used to unravel the genetic machinery of biological rhythms. Here, we present a statistical model for clustering periodic patterns of gene expression in terms of different transcriptional profiles. The model incorporates biologically meaningful Fourier series approximations of periodic gene expression into a mixture-model-based likelihood function, thus producing results that are likely to be more biologically relevant than those from existing models. In addition, because the structures of the time-dependent means and the covariance matrix are modeled, the new approach displays increased statistical power and precision of parameter estimation. The approach was used to reanalyze a real example with 800 periodically expressed transcriptional genes in yeast, leading to the identification of 13 distinct patterns of gene-expression cycles. The proposed model can be useful for characterizing the complex biological effects of gene expression and for generating testable hypotheses about the workings of developmental systems in a more precise, quantitative way.
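The abstract describes placing a Fourier series approximation of each cluster's mean expression curve inside a mixture-model likelihood. The following is a minimal sketch of that general idea, not the authors' implementation. It rests on several simplifying assumptions that the abstract does not state: the period is known, a low-order Fourier series is sufficient, and the residuals are independent with one shared variance (the paper models a structured covariance matrix, whose form is not given here). The function names `fourier_design` and `fit_fourier_mixture`, the farthest-point initialization, and all parameter names are hypothetical.

```python
import numpy as np


def fourier_design(t, period, order=1):
    """Design matrix with columns 1, cos(2*pi*k*t/T), sin(2*pi*k*t/T), k = 1..order."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, order + 1):
        w = 2.0 * np.pi * k / period
        cols += [np.cos(w * t), np.sin(w * t)]
    return np.column_stack(cols)


def fit_fourier_mixture(Y, t, period, n_clusters, order=1, n_iter=50):
    """EM for a Gaussian mixture whose cluster means are Fourier series in time.

    Y is an (n_genes, n_times) expression matrix. Assumption: independent
    residuals with a single shared variance, which is simpler than the
    structured covariance mentioned in the abstract.
    """
    Y = np.asarray(Y, dtype=float)
    n, m = Y.shape
    X = fourier_design(t, period, order)      # (n_times, n_coef)
    P = np.linalg.pinv(X)                     # least-squares projector, (n_coef, n_times)

    # Hypothetical initialization: pick per-gene Fourier fits that are far apart.
    fitted = Y @ P.T @ X.T
    idx = [0]
    for _ in range(1, n_clusters):
        d = ((fitted[:, None, :] - fitted[idx][None, :, :]) ** 2).sum(axis=2)
        idx.append(int(d.min(axis=1).argmax()))
    mu = fitted[idx]                          # (n_clusters, n_times)
    pi = np.full(n_clusters, 1.0 / n_clusters)
    sigma2 = Y.var()

    for _ in range(n_iter):
        # E-step: posterior probability that each gene belongs to each cluster.
        sq = ((Y[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        log_p = np.log(pi) - 0.5 * sq / sigma2 - 0.5 * m * np.log(2 * np.pi * sigma2)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        R = np.exp(log_p - log_norm)

        # M-step: mixing proportions, Fourier coefficients, shared variance.
        Nk = R.sum(axis=0) + 1e-10            # small floor guards against empty clusters
        pi = Nk / Nk.sum()
        coef = ((R.T @ Y) / Nk[:, None]) @ P.T   # weighted least squares per cluster
        mu = coef @ X.T
        sq = ((Y[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        sigma2 = max((R * sq).sum() / (n * m), 1e-12)

    return {"coef": coef, "pi": pi, "sigma2": sigma2, "resp": R}
```

Because every gene shares the same sampling times, the weighted least-squares update for a cluster reduces to projecting that cluster's responsibility-weighted mean profile onto the Fourier basis, which is why a single pseudoinverse suffices. Choosing the number of clusters (13 in the yeast reanalysis) would need a separate model-selection step that this sketch does not cover.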