Centre for Computational System Biology, Fudan University, Shanghai, People's Republic of China.
PLoS One. 2009;4(4):e5098. doi: 10.1371/journal.pone.0005098. Epub 2009 Apr 6.
We present a novel and systematic approach to analyze temporal microarray data. The approach includes normalization, clustering and network analysis of genes.
Gene expression values are normalized using an error-model-based uniform normalization method that identifies and estimates the sources of variation. The model minimizes the correlation among error terms across replicates. The normalized gene expression profiles are then clustered according to their power spectral density. The method of complex Granger causality is introduced to reveal interactions between sets of genes. Complex Granger causality, together with partial Granger causality, is applied in both the time and frequency domains, first to selected genes and then to all genes, to reveal networks of interactions. The approach is successfully applied to Arabidopsis leaf microarray data covering 31,000 genes observed at 22 time points over 22 days. Three circuits are analyzed in detail: a circadian gene circuit, an ethylene circuit, and a new global circuit whose hierarchical structure is used to determine the initiators of leaf senescence.
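The clustering step can be illustrated with a minimal sketch. It assumes a plain periodogram as the power spectral density estimate and a small k-means with farthest-point initialization as the clustering rule; neither choice is stated in the abstract, and the function names `normalized_psd` and `kmeans_labels` as well as the simulated profiles are hypothetical.

```python
import numpy as np

def normalized_psd(profiles):
    """Periodogram of each row (genes x time points), scaled to sum to 1."""
    x = np.asarray(profiles, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)        # remove the mean of each profile
    power = np.abs(np.fft.rfft(x, axis=1)) ** 2  # raw periodogram
    power = power[:, 1:]                         # drop the zero-frequency bin
    total = power.sum(axis=1, keepdims=True)
    return power / np.where(total > 0, total, 1.0)

def kmeans_labels(features, k, n_iter=50):
    """Small k-means; centers start from a farthest-point sweep (an assumption)."""
    centers = [features[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((features - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(features[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels

# Simulated data (not from the paper): 5 slow and 5 fast oscillating profiles,
# each sampled at 22 time points with a random phase and a little noise.
rng = np.random.default_rng(0)
t = np.arange(22)
slow = [np.sin(2 * np.pi * 1 * t / 22 + rng.uniform(0, 2 * np.pi))
        + rng.normal(scale=0.1, size=22) for _ in range(5)]
fast = [np.sin(2 * np.pi * 5 * t / 22 + rng.uniform(0, 2 * np.pi))
        + rng.normal(scale=0.1, size=22) for _ in range(5)]
labels = kmeans_labels(normalized_psd(np.array(slow + fast)), k=2)
print(labels)
```

Because the periodogram discards phase, profiles that oscillate at the same frequency but peak at different times fall into the same cluster, which is the property that makes a spectral representation attractive for grouping temporal expression profiles.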
We use a fully data-driven approach to form biological hypotheses. Clustering by power spectrum analysis helps us identify genes of potential interest. Their dynamics can then be captured accurately in the time and frequency domains using complex and partial Granger causality. As temporal microarray data become more widely available, such methods can be useful tools for uncovering hidden biological interactions. We demonstrate our method step by step with the help of toy models as well as a real biological dataset. We also analyse three distinct gene circuits of potential interest to Arabidopsis researchers.
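As a point of reference for the causality analysis, the sketch below computes the basic pairwise time-domain Granger causality, the log ratio of residual variances from a restricted and a full autoregressive model. This is the standard building block only; it is not the complex or partial variant used in the paper, and the helper names, the model order, and the simulated toy series are assumptions made for illustration.

```python
import numpy as np

def lagged(series, order):
    """Columns hold series[t-1], ..., series[t-order] for t = order .. n-1."""
    n = len(series)
    return np.column_stack([series[order - k: n - k] for k in range(1, order + 1)])

def residual_variance(target, predictors):
    """Variance of ordinary least-squares residuals (intercept included)."""
    design = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return (target - design @ coef).var()

def granger_causality(x, y, order=1):
    """Pairwise time-domain measure of y -> x: ln(var_restricted / var_full)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = x[order:]
    restricted = residual_variance(target, lagged(x, order))
    full = residual_variance(
        target, np.column_stack([lagged(x, order), lagged(y, order)])
    )
    return np.log(restricted / full)

# Toy model (simulated, not from the paper): y drives x with a one-step delay.
rng = np.random.default_rng(0)
n = 500
y = np.zeros(n)
x = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + rng.normal()
    x[t] = 0.8 * y[t - 1] + 0.1 * rng.normal()

f_y_to_x = granger_causality(x, y)
f_x_to_y = granger_causality(y, x)
print(f_y_to_x, f_x_to_y)
```

In this toy model the measure from `y` to `x` should be clearly positive while the reverse direction stays near zero, which is the asymmetry that a directed interaction network is built from.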