Laboratório de Química e Função de Proteínas e Peptídeos, Centro de Biociências e Biotecnologia, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Av. Alberto Lamego 2000, P5, sala 217, Campos dos Goytacazes, RJ, Brazil.
Planta. 2020 Nov 16;252(6):104. doi: 10.1007/s00425-020-03499-8.
We report a soybean gene co-expression network built with data from 1284 RNA-Seq experiments, which was used to identify important regulators and modules and to elucidate the fates of gene duplicates.

Soybean (Glycine max (L.) Merr.) is one of the most important crops worldwide, constituting a major source of protein and edible oil. Gene co-expression networks (GCNs) have been extensively used to study transcriptional regulation and the evolution of genes and genomes. Here, we report a soybean GCN built from 1284 publicly available RNA-Seq samples from 15 distinct tissues. We found modules that are differentially regulated in specific tissues, comprising processes such as photosynthesis, gluconeogenesis, lignin metabolism, and response to biotic stress. We identified transcription factors among intramodular hubs, which probably integrate different pathways and shape the transcriptional landscape in different conditions. The top hubs for each module tend to encode proteins with critical roles, such as succinate dehydrogenase and RNA polymerase subunits. Importantly, gene essentiality was strongly correlated with degree centrality, and essential hubs were enriched in genes involved in nucleic acid metabolism and regulation of cell replication. Using a guilt-by-association approach, we predicted functions for 93 of the 106 hubs that lack a functional description in soybean. Most of the duplicated genes had different transcriptional profiles, supporting their functional divergence, although paralogs originating from whole-genome duplications (WGD) were more often preserved in the same module than those from other mechanisms. Together, our results highlight the importance of GCN analysis in unraveling key functional aspects of the soybean genome, in particular those associated with hub genes and WGD events.
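The notions of degree centrality, hubs, and guilt-by-association used in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say how the network was inferred, so the example assumes a simple hard-thresholded Pearson correlation network. The gene names, expression values, annotations, and the 0.9 threshold are all hypothetical toy choices.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_network(expr, threshold=0.9):
    """Link two genes when |r| reaches the threshold (an assumed cutoff)."""
    neighbors = {gene: set() for gene in expr}
    for g1, g2 in combinations(expr, 2):
        if abs(pearson(expr[g1], expr[g2])) >= threshold:
            neighbors[g1].add(g2)
            neighbors[g2].add(g1)
    return neighbors


def degree_centrality(neighbors):
    """Fraction of the other genes that each gene is connected to."""
    n = len(neighbors)
    return {gene: len(nb) / (n - 1) for gene, nb in neighbors.items()}


def guilt_by_association(gene, neighbors, annotations):
    """Most common annotation among a gene's annotated neighbors, or None."""
    counts = Counter(
        annotations[nb] for nb in neighbors[gene] if nb in annotations
    )
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical toy data: rows are genes, columns are five samples.
expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [1.1, 2.0, 3.2, 3.9, 5.1],
    "geneD": [5, 1, 4, 2, 3],
    "geneU": [0.9, 2.1, 2.9, 4.2, 5.0],  # hub without functional description
}
annotations = {
    "geneA": "photosynthesis",
    "geneB": "photosynthesis",
    "geneC": "photosynthesis",
    "geneD": "lignin metabolism",
}

neighbors = build_network(expr)
centrality = degree_centrality(neighbors)
predicted = guilt_by_association("geneU", neighbors, annotations)
```

In this toy network, the unannotated gene `geneU` is co-expressed with three photosynthesis genes, so the majority vote assigns it that function. Real analyses typically work with thousands of genes and may use soft thresholding and module detection rather than a single correlation cutoff.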