Centre for Computational Biology, University of Birmingham, Birmingham, United Kingdom.
Instituto de Investigaciones Biomédicas, UNAM, Ciudad de México, México.
PLoS One. 2021 Apr 1;16(4):e0247671. doi: 10.1371/journal.pone.0247671. eCollection 2021.
Transcriptomes are known to organize themselves into gene co-expression clusters or modules where groups of genes display distinct patterns of coordinated or synchronous expression across independent biological samples. The functional significance of these co-expression clusters is suggested by the fact that highly co-expressed groups of genes tend to be enriched in genes involved in common functions and biological processes. While gene co-expression is widely assumed to reflect close regulatory proximity, the validity of this assumption remains unclear. Here we use a simple synthetic gene regulatory network (GRN) model and contrast the co-expression structure produced by these networks with their known regulatory architecture and with the co-expression structure measured in available human expression data. Using randomization tests, we found that the levels of co-expression observed in simulated expression data were, just as with empirical data, significantly higher than expected by chance. When examining the source of correlated expression, we found that individual regulators, both in simulated and experimental data, fail, on average, to display correlated expression with their immediate targets. However, highly correlated gene pairs tend to share at least one common regulator, while most gene pairs sharing common regulators do not necessarily display correlated expression. Our results demonstrate that widespread co-expression naturally emerges in regulatory networks, and that it is a reliable and direct indicator of active co-regulation in a given cellular context.
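The randomization test mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' GRN model or test statistic, so everything here is an assumption made for illustration: a toy linear model in which each target gene follows one randomly assigned regulator plus Gaussian noise, the mean absolute pairwise Pearson correlation as the co-expression statistic, and a null distribution built by shuffling each gene's values across samples independently. The function names (`simulate_expression`, `mean_abs_correlation`, `randomization_test`) and all parameter values are hypothetical.

```python
import numpy as np


def simulate_expression(n_regulators=5, n_targets=30, n_samples=50,
                        noise=0.5, seed=0):
    """Toy expression matrix (targets x samples). Assumed model, not the paper's."""
    rng = np.random.default_rng(seed)
    # Regulator activity varies independently across samples.
    regulators = rng.normal(size=(n_regulators, n_samples))
    # Hypothetical wiring: each target is driven by exactly one regulator.
    wiring = rng.integers(0, n_regulators, size=n_targets)
    targets = regulators[wiring] + noise * rng.normal(size=(n_targets, n_samples))
    return targets, wiring


def mean_abs_correlation(expr):
    """Mean |Pearson r| over all distinct gene pairs (rows of expr)."""
    corr = np.corrcoef(expr)
    upper = np.triu_indices_from(corr, k=1)
    return float(np.abs(corr[upper]).mean())


def randomization_test(expr, n_perm=200, seed=1):
    """Compare observed co-expression with a sample-shuffled null."""
    rng = np.random.default_rng(seed)
    observed = mean_abs_correlation(expr)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffle every gene's samples independently, which destroys
        # gene-gene correlation but keeps each gene's own distribution.
        shuffled = np.array([rng.permutation(row) for row in expr])
        null[i] = mean_abs_correlation(shuffled)
    # One-sided empirical p-value with the usual +1 correction.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, float(p_value)


if __name__ == "__main__":
    expr, _ = simulate_expression()
    observed, null, p_value = randomization_test(expr)
    print(f"observed mean |r| = {observed:.3f}")
    print(f"null mean |r|     = {null.mean():.3f}")
    print(f"empirical p-value = {p_value:.4f}")
```

In this toy setting, targets that share a regulator are strongly correlated by construction, so the observed statistic exceeds the shuffled null. It does not reproduce the paper's finding that most gene pairs sharing a regulator are not necessarily correlated, which would require a richer network model than the one assumed here.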