Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, 60439, USA.
Computation Institute, The University of Chicago, Chicago, IL, 60637, USA.
Plant J. 2018 Sep;95(6):1102-1113. doi: 10.1111/tpj.14003. Epub 2018 Aug 9.
Genome-scale metabolic reconstructions help us to understand and engineer metabolism. Next-generation sequencing technologies are delivering genomes and transcriptomes for an ever-widening range of plants. While such omic data can, in principle, be used to compare metabolic reconstructions in different species, organs and environmental conditions, these comparisons require a standardized framework for the reconstruction of metabolic networks from transcript data. We previously introduced PlantSEED as a framework covering primary metabolism for 10 species. We have now expanded PlantSEED to include 39 species and provide tools that enable automated annotation and metabolic reconstruction from transcriptome data. The algorithm for automated annotation in PlantSEED propagates annotations using a set of signature k-mers (short amino acid sequences characteristic of particular proteins) that identify metabolic enzymes with an accuracy of about 97%. PlantSEED reconstructions are built from a curated template that includes consistent compartmentalization for more than 100 primary metabolic subsystems. Together, the annotation and reconstruction algorithms produce reconstructions without gaps and with more accurate compartmentalization than existing resources. These tools are available via the PlantSEED web interface at http://modelseed.org, which enables users to upload, annotate and reconstruct from private transcript data and simulate metabolic activity under various conditions using flux balance analysis. We demonstrate the ability to compare these metabolic reconstructions with a case study involving growth on several nitrogen sources in roots of four species.
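The signature k-mer idea described above can be illustrated with a minimal sketch. It is not the PlantSEED implementation: the function names (`kmers`, `build_signatures`, `annotate`), the k-mer length, the `min_hits` threshold and the toy sequences are all assumptions made for illustration. The sketch treats a k-mer as a signature when it occurs in proteins of exactly one function, then assigns a query protein the function with the most signature hits.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping length-k substring of an amino acid sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_signatures(annotated, k):
    """Map each k-mer seen under exactly one function to that function.

    `annotated` is a list of (sequence, function) pairs; k-mers shared by
    proteins with different functions are discarded as non-discriminating.
    """
    seen = {}
    for seq, function in annotated:
        for kmer in kmers(seq, k):
            seen.setdefault(kmer, set()).add(function)
    return {kmer: next(iter(funcs)) for kmer, funcs in seen.items() if len(funcs) == 1}


def annotate(seq, signatures, k, min_hits=2):
    """Return the function with the most signature hits, or None if too few.

    `min_hits` is a hypothetical threshold, not a value taken from the paper.
    """
    hits = Counter(signatures[kmer] for kmer in kmers(seq, k) if kmer in signatures)
    if not hits:
        return None
    function, count = hits.most_common(1)[0]
    return function if count >= min_hits else None


# Toy reference set (made-up sequences and function labels).
K = 4
reference = [
    ("MKTAYIAKQR", "enzyme A"),
    ("MKTAYLVGGH", "enzyme B"),
]
signatures = build_signatures(reference, K)

# A query sharing several enzyme A signature k-mers is annotated as enzyme A;
# a query that only shares the non-discriminating prefix is left unannotated.
result_a = annotate("GGTAYIAKQW", signatures, K)
result_none = annotate("MKTAYPPPPP", signatures, K)
```

In this toy case the k-mers `MKTA` and `KTAY` occur in both reference proteins, so they are excluded from the signature set and cannot drive an annotation on their own.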