Shojaie Ali, Michailidis George
University of Michigan, Ann Arbor, USA.
Stat Appl Genet Mol Biol. 2010;9(1):Article22. doi: 10.2202/1544-6115.1483. Epub 2010 May 22.
Cellular functions of living organisms are carried out through complex systems of interacting components. Including such interactions in the analysis, and considering sub-systems defined by biological pathways instead of individual components (e.g. genes), can lead to new findings about complex biological mechanisms. Networks are often used to capture such interactions and can be incorporated in models to improve the efficiency in estimation and inference. In this paper, we propose a model for incorporating external information about interactions among genes (proteins/metabolites) in differential analysis of gene sets. We exploit the framework of mixed linear models and propose a flexible inference procedure for analysis of changes in biological pathways. The proposed method facilitates the analysis of complex experiments, including multiple experimental conditions and temporal correlations among observations. We propose an efficient iterative algorithm for estimation of the model parameters and show that the proposed framework is asymptotically robust to the presence of noise in the network information. The performance of the proposed model is illustrated through the analysis of gene expression data for environmental stress response (ESR) in yeast, as well as simulated data sets.
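As a rough illustration of why network information can matter in differential analysis, the sketch below simulates a hypothetical three-gene chain and compares naive mean differences with differences in latent (network-adjusted) effects. Everything here is an assumption made for illustration: the influence matrix (I - A)^{-1}, the edge weights, the noise level, and the helper names `influence_matrix`, `latent_effects`, and `simulate` do not come from the abstract. It is a simplified fixed-effects calculation, not the mixed linear model or the iterative estimation algorithm the authors propose.

```python
import numpy as np

def influence_matrix(adj):
    """Return (I - A)^{-1} for a weighted adjacency matrix A (assumed acyclic)."""
    p = adj.shape[0]
    return np.linalg.inv(np.eye(p) - adj)

def latent_effects(expr, adj):
    """Estimate per-gene latent means by removing effects propagated over the network.

    expr: (n_samples, p) expression matrix for one condition.
    """
    lam = influence_matrix(adj)
    # Solve Lambda @ gamma = mean expression for gamma.
    return np.linalg.solve(lam, expr.mean(axis=0))

rng = np.random.default_rng(0)

# Hypothetical chain: gene 0 -> gene 1 -> gene 2, edge weight 0.8.
# adj[i, j] is the assumed effect of gene j on gene i.
adj = np.array([[0.0, 0.0, 0.0],
                [0.8, 0.0, 0.0],
                [0.0, 0.8, 0.0]])
lam = influence_matrix(adj)

gamma_control = np.zeros(3)
gamma_treated = np.array([1.0, 0.0, 0.0])  # only gene 0 is directly perturbed

def simulate(gamma, n=200):
    """Draw n samples whose latent effects are gamma plus noise, then propagate them."""
    latent = gamma + 0.1 * rng.standard_normal((n, 3))
    return latent @ lam.T

y_control = simulate(gamma_control)
y_treated = simulate(gamma_treated)

# Naive per-gene difference: all three genes appear to change.
naive_diff = y_treated.mean(axis=0) - y_control.mean(axis=0)
# Network-adjusted difference: the change is attributed to gene 0 only.
latent_diff = latent_effects(y_treated, adj) - latent_effects(y_control, adj)

print("naive difference: ", np.round(naive_diff, 2))
print("latent difference:", np.round(latent_diff, 2))
```

In this toy setting the naive comparison flags the downstream genes because the perturbation of gene 0 propagates along the chain, while the adjusted comparison localizes the change to the gene that was perturbed directly. A pathway-level test would then combine these latent effects over the genes in a set.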