Wisecaver Jennifer H, Borowsky Alexander T, Tzin Vered, Jander Georg, Kliebenstein Daniel J, Rokas Antonis
Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235.
French Associates Institute for Agriculture and Biotechnology of Drylands, Jacob Blaustein Institute for Desert Research, Ben Gurion University, Sede-Boqer Campus 84990, Israel.
Plant Cell. 2017 May;29(5):944-959. doi: 10.1105/tpc.17.00009. Epub 2017 Apr 13.
Plants produce diverse specialized metabolites (SMs), but the genes responsible for their production and regulation remain largely unknown, hindering efforts to tap the plant pharmacopeia. Given that genes comprising SM pathways exhibit environmentally dependent coregulation, we hypothesized that genes within an SM pathway would form tight associations (modules) with each other in coexpression networks, facilitating their identification. To evaluate this hypothesis, we used 10 global coexpression data sets, each a meta-analysis of hundreds to thousands of experiments, across eight plant species to identify hundreds of coexpressed gene modules per data set. In support of our hypothesis, 15.3 to 52.6% of modules contained two or more known SM biosynthetic genes, and module genes were enriched in SM functions. Moreover, modules recovered many experimentally validated SM pathways, including all six known to form biosynthetic gene clusters (BGCs). In contrast, bioinformatically predicted BGCs (i.e., those lacking an associated metabolite) were no more coexpressed than the null distribution for neighboring genes. These results suggest that most predicted plant BGCs are not genuine SM pathways and argue that BGCs are not a hallmark of plant specialized metabolism. We submit that global gene coexpression is a rich, largely untapped resource for discovering the genetic basis and architecture of plant natural products.
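The comparison of a candidate gene cluster against a null distribution for neighboring genes can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes plain Pearson correlation as the coexpression measure, random runs of adjacent genes as the null, and synthetic expression data. All function names and parameters (`pearson`, `neighbor_null`, `window`, `n_draws`) are hypothetical and chosen for illustration only.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / (sx * sy)


def mean_pairwise_coexpression(profiles):
    """Average correlation over all gene pairs in a set (needs >= 2 genes)."""
    pairs = list(itertools.combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def neighbor_null(genome_profiles, window, n_draws, rng):
    """Null scores from random runs of `window` adjacent genes.

    Assumes `genome_profiles` is ordered by chromosomal position.
    """
    scores = []
    for _ in range(n_draws):
        start = rng.randrange(len(genome_profiles) - window + 1)
        run = genome_profiles[start:start + window]
        scores.append(mean_pairwise_coexpression(run))
    return scores


def empirical_p(observed, null_scores):
    """Share of null draws at least as coexpressed as the observed set.

    Uses add-one smoothing so the estimate is never exactly zero.
    """
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)


# Synthetic data (an assumption, not real expression measurements).
rng = random.Random(0)
n_conditions = 20

# 200 "genomic" genes with independent expression profiles.
genome = [[rng.gauss(0, 1) for _ in range(n_conditions)] for _ in range(200)]

# A 4-gene candidate cluster that shares one regulatory signal plus noise.
signal = [rng.gauss(0, 1) for _ in range(n_conditions)]
cluster = [[s + rng.gauss(0, 0.3) for s in signal] for _ in range(4)]

observed = mean_pairwise_coexpression(cluster)
null = neighbor_null(genome, window=4, n_draws=500, rng=rng)
p_value = empirical_p(observed, null)
print(f"observed mean r = {observed:.2f}, empirical p = {p_value:.4f}")
```

In this toy setting the coregulated cluster stands far above the null; a predicted cluster whose score falls inside the null distribution would, by the abstract's argument, offer no coexpression support for being a genuine pathway.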