Department of Computer Science, Warwick University, Coventry CV4 7AL, UK.
BMC Bioinformatics. 2010 Jan 30;11:68. doi: 10.1186/1471-2105-11-68.
Time-course microarray experiments produce data that can help in understanding the underlying dynamics of a system. Clustering is an important stage in microarray data analysis, in which the data are grouped according to shared characteristics. Most clustering techniques are based on distance or visual similarity measures, which may not be suitable for temporal microarray data, where the sequential nature of time is important. We present a Granger causality based technique for clustering temporal microarray gene expression data. It measures the interdependence between two time series by statistically testing whether one time series can be used to forecast the other.
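As an illustration of the pairwise test described above, the following is a minimal sketch of a Granger-style F statistic, not the authors' implementation. It assumes a linear autoregressive model with an intercept and a fixed lag, and it compares a restricted fit (past values of y only) with a full fit (past values of y and x). The function names `solve_linear`, `rss`, and `granger_f`, the lag of 1, and the synthetic series are all hypothetical choices made for this example. Turning the statistic into a p-value would require the F distribution with (lag, dof) degrees of freedom, which is omitted here.

```python
import random


def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is square and non-singular (no check is made).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(rows, y):
    """Residual sum of squares of an ordinary least-squares fit of y on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum(
        (yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y)
    )


def granger_f(x, y, lag=1):
    """F statistic for the hypothesis that past values of x help forecast y."""
    n = len(y)
    target = y[lag:]
    # Restricted model: intercept plus `lag` past values of y.
    restricted = [
        [1.0] + [y[t - k] for k in range(1, lag + 1)] for t in range(lag, n)
    ]
    # Full model: the restricted regressors plus `lag` past values of x.
    full = [
        row + [x[t - k] for k in range(1, lag + 1)]
        for row, t in zip(restricted, range(lag, n))
    ]
    rss_restricted = rss(restricted, target)
    rss_full = rss(full, target)
    dof = len(target) - (2 * lag + 1)
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)


# Synthetic example (assumed data): y follows x with a delay of one step.
random.seed(0)
x_series = [random.gauss(0, 1) for _ in range(200)]
y_series = [0.0] + [0.8 * x_series[t - 1] + random.gauss(0, 0.1) for t in range(1, 200)]
print("F(x -> y):", granger_f(x_series, y_series))
print("F(y -> x):", granger_f(y_series, x_series))
```

A large statistic in one direction and a small one in the other suggests that the first series carries forecasting information about the second. Real time-course microarray series are usually much shorter than this synthetic example, so the available degrees of freedom limit the lag that can be tested.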
A gene-association matrix is constructed by testing the temporal relationship between each pair of genes with the Granger causality test. The association matrix is then analyzed using a graph-theoretic technique to detect highly connected components, which represent interesting biological modules. We test our approach on synthesized datasets and on real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results against the existing biological literature. We also report interesting structural properties of the association network that are commonly desired in biological systems.
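The step from association matrix to gene modules can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the graph-theoretic technique, so plain connected components of a thresholded, undirected graph are used here as a simple stand-in for "highly connected components". The functions `association_graph` and `connected_components`, the score matrix, and the threshold are hypothetical.

```python
from collections import deque


def association_graph(scores, threshold):
    """Build an undirected adjacency map from a square score matrix.

    An edge i-j is added when the score in either direction exceeds
    `threshold` (an assumed rule; a significance test could be used instead).
    """
    n = len(scores)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(n):
            if i != j and scores[i][j] > threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def connected_components(adjacency):
    """Return the connected components of the graph via breadth-first search."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


# Assumed toy scores for five genes: entry [i][j] scores "gene i forecasts gene j".
toy_scores = [
    [0.0, 9.0, 0.5, 0.1, 0.2],
    [0.3, 0.0, 7.5, 0.2, 0.1],
    [0.4, 0.2, 0.0, 0.3, 0.1],
    [0.1, 0.2, 0.3, 0.0, 8.0],
    [0.2, 0.1, 0.1, 0.4, 0.0],
]
modules = connected_components(association_graph(toy_scores, threshold=4.0))
print(modules)
```

Connected components are the loosest possible notion of a module. A stricter criterion, such as requiring a minimum edge density within each group, would be closer to the "highly connected" wording, at the cost of a more involved algorithm.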
Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple to implement and is statistically traceable at each step. It can produce sets of functionally related genes, which can then be used for reverse-engineering of gene circuits.