Systems Biology Centre, University of Warwick, Coventry, UK.
Bioinformatics. 2012 Jun 15;28(12):i233-41. doi: 10.1093/bioinformatics/bts222.
The generation of time series transcriptomic datasets collected under multiple experimental conditions has proven to be a powerful approach for disentangling complex biological processes, allowing for the reverse engineering of gene regulatory networks (GRNs). Most methods for reverse engineering GRNs from multiple datasets assume that each time series was generated from a network with identical topology. In this study, we outline a hierarchical, non-parametric Bayesian approach for reverse engineering GRNs using multiple time series that can be applied in a number of novel situations, including: (i) where different, but overlapping, sets of transcription factors are expected to bind in the different experimental conditions, that is, where switching events could potentially arise under the different treatments; and (ii) for inference in evolutionarily related species in which orthologous GRNs exist. More generally, the method can be used to identify context-specific regulation by combining time series gene expression data with methods that can identify putative lists of transcription factors or transcription factor targets.
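The hierarchical idea can be illustrated with a minimal sketch. The abstract does not specify the form of the prior, so everything below is an assumption made for illustration: each condition's set of regulators (its parent set) is drawn from a prior centred on a shared "hyperparent" set, with probability decaying in the number of regulators on which the two sets disagree. The names `hamming`, `condition_prior`, `beta` and `max_parents` are hypothetical and are not taken from the authors' Matlab implementation.

```python
import itertools
import math


def hamming(a, b, candidates):
    """Count the candidate regulators on which two parent sets disagree."""
    return sum((tf in a) != (tf in b) for tf in candidates)


def condition_prior(hyper_parents, candidates, beta, max_parents=2):
    """Prior over one condition's parent set, given a shared hyperparent set.

    Assumed form (illustrative only, not stated in the abstract):
    P(Pa | Pa*) is proportional to exp(-beta * Hamming(Pa, Pa*)).
    beta = 0 makes the conditions independent (uniform prior);
    large beta forces every condition towards the shared topology.
    """
    # Enumerate every parent set with at most max_parents regulators.
    parent_sets = [
        frozenset(combo)
        for k in range(max_parents + 1)
        for combo in itertools.combinations(candidates, k)
    ]
    weights = [
        math.exp(-beta * hamming(ps, hyper_parents, candidates))
        for ps in parent_sets
    ]
    total = sum(weights)
    return {ps: w / total for ps, w in zip(parent_sets, weights)}


# Hypothetical example: three candidate transcription factors, two shared.
candidates = ["TF1", "TF2", "TF3"]
hyper = frozenset({"TF1", "TF2"})
prior = condition_prior(hyper, candidates, beta=1.0)
most_likely = max(prior, key=prior.get)
```

Under this assumed prior, a condition may still gain or lose a regulator relative to the shared set, which is one way to represent a switching event, while conditions with weak data borrow strength from the shared topology.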
The hierarchical inference outperforms related (but non-hierarchical) approaches when the networks used to generate the data were identical, and performs comparably even when the networks used to generate the data were independent. The method was subsequently used alongside yeast one-hybrid and microarray time series data to infer potential transcriptional switches in the Arabidopsis thaliana response to stress. The results confirm previous biological studies and allow for additional insights into gene regulation under various abiotic stresses.
The methods outlined in this article have been implemented in MATLAB and are available on request.