Liang Yulan, Kelemen Arpad
Department of Organizational Systems and Adult Health, University of Maryland, Baltimore, 21201, USA.
Biom J. 2009 Feb;51(1):56-69. doi: 10.1002/bimj.200710489.
Finite mixture models can provide insight into behavioral patterns as a source of heterogeneity in the dynamics of time-course gene expression data: they reduce the high dimensionality and expose the major components of the data's underlying structure in terms of unobservable latent variables. The latent structure of the dynamic transitions in gene expression over time can be represented by Markov processes. This paper addresses key problems in the analysis of large gene expression data sets that describe systemic temporal response cascades and dynamic changes in response to therapeutic doses in multiple tissues, such as liver, skeletal muscle, and kidney, from the same animals. A Bayesian finite Markov mixture model with a Dirichlet prior is developed to identify differentially expressed time-related genes and dynamic clusters. The deviance information criterion is used to determine the number of mixture components and to compare and select models. The proposed Bayesian models are applied to multiple-tissue polygenetic temporal gene expression data and compared with CAGED, a Bayesian model-based clustering method. The results show that the proposed Bayesian finite Markov mixture model captures the dynamic changes and patterns in irregular, complex temporal data well.
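As a rough illustration of how a Dirichlet prior combines with a Markov transition structure, the sketch below computes the posterior mean of the transition probabilities for a single discretized expression profile. It is not the model from the paper: the three-state coding (down, unchanged, up), the symmetric prior weight `alpha`, and the function names are all assumptions made for the example, and the mixture over clusters is omitted entirely.

```python
def transition_counts(states, n_states):
    """Count observed transitions between consecutive states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return counts


def posterior_mean_transitions(counts, alpha=1.0):
    """Posterior mean of each transition row under a symmetric
    Dirichlet(alpha, ..., alpha) prior (conjugate to the multinomial)."""
    n_states = len(counts)
    posterior = []
    for row in counts:
        total = sum(row) + alpha * n_states
        posterior.append([(c + alpha) / total for c in row])
    return posterior


# Hypothetical discretized profile: 0 = down, 1 = unchanged, 2 = up.
profile = [1, 2, 2, 1, 0, 0, 1]
counts = transition_counts(profile, 3)
post = posterior_mean_transitions(counts, alpha=1.0)

for row in post:
    print([round(p, 3) for p in row])
```

Each row of `post` is a probability distribution over the next state, and the prior keeps transitions that were never observed from being assigned probability zero. In a full mixture model, a separate set of such rows would be estimated for each latent cluster.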