Castro Juan Camilo, Valdés Ivan, Gonzalez-García Laura Natalia, Danies Giovanna, Cañas Silvia, Winck Flavia Vischi, Ñústez Carlos Eduardo, Restrepo Silvia, Riaño-Pachón Diego Mauricio
Department of Biological Sciences, Universidad de los Andes, Bogotá D.C., Colombia.
Department of Design, Universidad de los Andes, Bogotá D.C., Colombia.
Theor Biol Med Model. 2019 Apr 9;16(1):7. doi: 10.1186/s12976-019-0103-7.
The growing volume of genomics data has improved our understanding of the molecular dynamics of complex systems such as plant and animal diseases. However, transcriptional regulation, although it plays a central role in the decision-making process of cellular systems, is still poorly understood. In this study, we linked expression data with mathematical models to infer gene regulatory networks (GRNs). We present a simple yet effective method to estimate the GRNs of transcription factors from transcriptional data.
We defined interactions between pairs of genes (edges in the GRN) as the partial mutual information between these genes, taking into account time and possible time lags of one gene relative to another. We call this method Gene Regulatory Networks on Transfer Entropy (GRNTE); it corresponds to Granger causality for Gaussian variables in an autoregressive model. To evaluate the reconstruction accuracy of our method, we generated several sub-networks from the GRN of the eukaryotic model yeast Saccharomyces cerevisiae. We then applied the method to experimental data from the plant pathogen Phytophthora infestans. Using RT-qPCR, we measured the expression levels of 48 transcription factors of P. infestans during its interaction with one moderately resistant and one susceptible cultivar of yellow potato (Solanum tuberosum group Phureja). With these data, we reconstructed the regulatory network of P. infestans during its interaction with these hosts.
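The stated correspondence between transfer entropy and Granger causality for Gaussian variables can be illustrated with a minimal sketch. It is not the GRNTE implementation: it assumes a single fixed lag, a linear autoregressive model, and plain sample covariances, and the function name `gaussian_transfer_entropy` and the simulated series are hypothetical. Under those assumptions, the transfer entropy from a source gene to a target gene is half the log ratio of the target's residual variance without and with the source's past.

```python
import math
import random


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def gaussian_transfer_entropy(source, target, lag=1):
    """Estimate TE(source -> target) in nats, assuming Gaussian variables.

    Hypothetical helper: compares the residual variance of target[t]
    regressed on target[t-lag] alone (restricted model) with the residual
    variance when source[t-lag] is added as a predictor (full model).
    """
    now = target[lag:]
    past_t = target[:-lag]
    past_s = source[:-lag]

    # Restricted model: the target's own past only.
    var_restricted = _cov(now, now) - _cov(now, past_t) ** 2 / _cov(past_t, past_t)

    # Full model: target's past plus source's past (explicit 2x2 inverse).
    s_tt, s_ss = _cov(past_t, past_t), _cov(past_s, past_s)
    s_ts = _cov(past_t, past_s)
    c_t, c_s = _cov(now, past_t), _cov(now, past_s)
    det = s_tt * s_ss - s_ts ** 2
    explained = (c_t ** 2 * s_ss - 2 * c_t * c_s * s_ts + c_s ** 2 * s_tt) / det
    var_full = _cov(now, now) - explained

    return 0.5 * math.log(var_restricted / var_full)


# Simulated example (assumption): x drives y with a one-step delay.
random.seed(0)
n = 500
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.0] + [0.8 * x[t - 1] + random.gauss(0, 0.1) for t in range(1, n)]

te_xy = gaussian_transfer_entropy(x, y)  # large: x's past predicts y
te_yx = gaussian_transfer_entropy(y, x)  # near zero: y's past does not predict x
```

Because the measure is asymmetric, comparing the two directions suggests the orientation of an edge, which a symmetric measure such as plain mutual information cannot do.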
We first evaluated the performance of GRNTE, our transfer-entropy-based method, on eukaryotic datasets derived from the GRN of the yeast S. cerevisiae. The results suggest that GRNTE is comparable to state-of-the-art methods when the parameters for edge detection are properly tuned. In P. infestans, most of the genes considered in this study showed a significant change in expression from the onset of the interaction (0 hours post inoculation, hpi) to the later time points. Hierarchical clustering of the expression data discriminated two distinct periods during the infection, from 12 to 36 hpi and from 48 to 72 hpi, for both the moderately resistant and the susceptible cultivars. These periods could be associated with two phases of the pathogen's life cycle when infecting the host plant: the biotrophic and necrotrophic phases.
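The dependence on edge-detection parameters can be sketched as follows. This is an illustrative assumption rather than the evaluation used in the study: the pairwise scores, the reference network, the threshold values, and the helper names `call_edges` and `precision_recall` are all hypothetical. The sketch keeps a directed edge when its score exceeds a threshold and compares the result with a known network.

```python
def call_edges(score_matrix, threshold):
    """Return directed edges (source, target) whose score exceeds threshold."""
    return {
        (src, tgt)
        for src, row in score_matrix.items()
        for tgt, score in row.items()
        if src != tgt and score > threshold
    }


def precision_recall(predicted, reference):
    """Precision and recall of predicted edges against a reference edge set."""
    true_pos = len(predicted & reference)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical pairwise scores for three genes (made-up numbers).
scores = {
    "A": {"B": 0.90, "C": 0.05},
    "B": {"A": 0.02, "C": 0.60},
    "C": {"A": 0.30, "B": 0.01},
}
reference = {("A", "B"), ("B", "C")}  # assumed known network

strict = call_edges(scores, threshold=0.5)  # keeps only the two strong edges
loose = call_edges(scores, threshold=0.1)   # also admits C -> A

strict_pr = precision_recall(strict, reference)
loose_pr = precision_recall(loose, reference)
```

In this toy case the looser threshold admits one spurious edge and lowers precision without improving recall, which is the kind of trade-off that tuning the edge-detection parameters has to balance.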
We presented an algorithmic solution to the problem of network reconstruction from time series data. This analytical perspective makes use of the dynamic nature of time series data as it relates to intrinsically dynamic processes such as transcriptional regulation, where multiple elements of the cell (e.g., transcription factors) act simultaneously and change over time. We applied the algorithm to study the regulatory network of P. infestans during its interaction with two hosts that differ in their level of resistance to the pathogen. Although the gene expression analysis did not show differences between the two hosts, the GRN analyses revealed rewiring of the genes' interactions according to the resistance level of the host. This suggests that different regulatory processes are activated in response to different environmental cues. Applications of our methodology showed that it could reliably predict where to place edges in the transcriptional networks and sub-networks. The experimental approach used here can help provide insights into the biological role of these interactions in complex processes such as pathogenicity. The code used is available at https://github.com/jccastrog/GRNTE under the GNU General Public License 3.0.