Schläpfer Pascal, Zhang Peifen, Wang Chuan, Kim Taehyong, Banf Michael, Chae Lee, Dreher Kate, Chavali Arvind K, Nilo-Poyanco Ricardo, Bernard Thomas, Kahn Daniel, Rhee Seung Y
Carnegie Institution for Science, Plant Biology Department, Stanford, California 94305 (P.S., P.Z., C.W., T.K., M.B., L.C., K.D., A.K.C., R.N.-P., S.Y.R.); and
Laboratoire Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, Centre National de la Recherche Scientifique, Institut National de la Recherche Agronomique, Unité Mixte de Recherche 5558, 69622 Villeurbanne, France (T.B., D.K.).
Plant Physiol. 2017 Apr;173(4):2041-2059. doi: 10.1104/pp.16.01942. Epub 2017 Feb 22.
Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments, but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters.