Lin Kuang, Husmeier Dirk
Biomathematics & Statistics Scotland (BioSS), Edinburgh, UK.
EURASIP J Bioinform Syst Biol. 2009;2009(1):601068. doi: 10.1155/2009/601068. Epub 2009 Jun 11.
Understanding the mechanisms of gene transcriptional regulation through analysis of high-throughput postgenomic data is one of the central problems of computational systems biology. Various approaches have been proposed, but most of them fail to address at least one of the following objectives: (1) allow for the fact that transcription factors are potentially subject to posttranscriptional regulation; (2) allow for the fact that transcription factors cooperate as a functional complex in regulating gene expression; and (3) provide a model and a learning algorithm with manageable computational complexity. The objective of the present study is to propose and test a method that addresses these three issues. The model we employ is a mixture of factor analyzers, in which the latent variables correspond to different transcription factors, grouped into complexes or modules. We pursue inference in a Bayesian framework, using the Variational Bayesian Expectation Maximization (VBEM) algorithm for approximate inference of the posterior distributions of the model parameters, and estimation of a lower bound on the marginal likelihood for model selection. We have evaluated the performance of the proposed method on three criteria: activity profile reconstruction, gene clustering, and network inference.
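To make the model class concrete, the following is a minimal sketch of the generative side of a generic mixture of factor analyzers, written with NumPy. It is not the authors' implementation and does not perform VBEM inference. The function name `sample_mfa`, all dimensions, the choice of genes as data points, the diagonal noise level, and the standard-normal priors are illustrative assumptions that do not come from the abstract.

```python
import numpy as np


def sample_mfa(n_genes, n_conditions, n_modules, n_factors, seed=0):
    """Draw synthetic data from a generic mixture of factor analyzers.

    Hypothetical illustration: each gene is assigned to one module
    (mixture component), and each module has its own loading matrix
    that maps latent factor activities to expression values.
    """
    rng = np.random.default_rng(seed)

    # Mixing proportions over modules (assumed flat Dirichlet prior).
    pi = rng.dirichlet(np.ones(n_modules))

    # One loading matrix and one mean vector per module (assumed priors).
    loadings = rng.normal(size=(n_modules, n_conditions, n_factors))
    means = rng.normal(size=(n_modules, n_conditions))

    # Diagonal observation noise; the value 0.1 is an arbitrary assumption.
    noise_sd = 0.1 * np.ones(n_conditions)

    # Module assignment for each gene.
    z = rng.choice(n_modules, size=n_genes, p=pi)

    # Latent factors with a standard normal prior.
    x = rng.normal(size=(n_genes, n_factors))

    # Observation: y_g = Lambda_{z_g} x_g + mu_{z_g} + noise.
    signal = np.einsum("gdk,gk->gd", loadings[z], x)
    noise = noise_sd * rng.normal(size=(n_genes, n_conditions))
    y = signal + means[z] + noise
    return y, z, x


if __name__ == "__main__":
    y, z, x = sample_mfa(n_genes=50, n_conditions=6, n_modules=3, n_factors=2)
    print(y.shape, z.shape, x.shape)
```

In this sketch, inference would run in the opposite direction: given only `y`, a procedure such as VBEM would approximate the posterior over the assignments, factors, and loadings, and the lower bound on the marginal likelihood could be compared across candidate values of `n_modules` and `n_factors`.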