Wienkoop Stefanie, Morgenthal Katja, Wolschin Florian, Scholz Matthias, Selbig Joachim, Weckwerth Wolfram
Max Planck Institute of Molecular Plant Physiology, 14424 Potsdam, Germany.
Mol Cell Proteomics. 2008 Sep;7(9):1725-36. doi: 10.1074/mcp.M700273-MCP200. Epub 2008 Apr 28.
Statistical mining and integration of complex molecular data including metabolites, proteins, and transcripts is one of the critical goals of systems biology (Ideker, T., Galitski, T., and Hood, L. (2001) A new approach to decoding life: systems biology. Annu. Rev. Genomics Hum. Genet. 2, 343-372). A number of studies have demonstrated the parallel analysis of metabolites and large-scale transcript expression. Protein analysis has been ignored in these studies, although a clear correlation between transcript and protein levels is shown only in rare cases, so actual protein levels must be determined for protein function analysis. Here, we present an approach to investigate the combined covariance structure of metabolite and protein dynamics in a systemic response to abiotic temperature stress in Arabidopsis thaliana wild-type and a corresponding starch-deficient mutant (phosphoglucomutase-deficient). Independent component analysis revealed a phenotype classification that resolves genotype-dependent response effects to temperature treatment and genotype-independent general temperature compensation mechanisms. One observation is the stress-induced increase of raffinose-family oligosaccharide levels in the absence of transitory starch storage/mobilization in temperature-treated phosphoglucomutase-deficient plants, indicating that sucrose synthesis and storage in these mutant plants are sufficient to bypass the typical starch storage/mobilization pathways under abiotic stress. Finally, sample pattern recognition and correlation network topology analysis allowed for the detection of specific metabolite-protein co-regulation and the assignment of a circadian output-regulated RNA-binding protein to these processes. The whole concept of high-dimensional profiling data integration from many replicates, subsequent multivariate statistics for dimensionality reduction, and covariance structure analysis is proposed to be a major strategy for revealing central responses of the biological system under study.