Grzegorczyk Marco, Aderhold Andrej, Husmeier Dirk
Stat Appl Genet Mol Biol. 2015 Apr;14(2):143-67. doi: 10.1515/sagmb-2014-0041.
There has been much interest in reconstructing bi-directional regulatory networks linking the circadian clock to metabolism in plants. A variety of reverse engineering methods from machine learning and computational statistics have been proposed and evaluated. The present paper focuses on combining models in an ensemble to boost network reconstruction accuracy, and on exploring various model combination strategies to maximize the improvement. Our results demonstrate that a rich ensemble of predictors outperforms the best individual model, even if the ensemble includes poor predictors with inferior individual reconstruction accuracy. For our application to metabolomic and transcriptomic time series from various mutant plants grown in different light-dark cycles, we also show how to determine the optimal time lag of interactions, and we identify significant interactions with a randomization test. Our study predicts new statistically significant interactions between circadian clock genes and metabolites in Arabidopsis thaliana. It thus provides independent statistical evidence that the regulation of metabolism by the circadian clock is not uni-directional, and that there is a statistically significant feedback mechanism from metabolism back to the circadian clock.
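The three ingredients named in the abstract (combining predictors, choosing a time lag, and a randomization test) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the ensemble is a plain rank average of per-method edge scores, the interaction score is a lagged Pearson correlation, and the null distribution comes from shuffling one series. The function names and the dictionary-of-scores layout are hypothetical.

```python
import random
from statistics import mean


def rank_scores(scores):
    """Rank edges by score (1 = strongest); tied scores share the average rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks, i = {}, 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[ordered[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def ensemble_rank_average(method_scores):
    """Assumed combination rule: mean rank per edge (lower = stronger consensus)."""
    per_method = [rank_scores(s) for s in method_scores]
    return {e: mean(r[e] for r in per_method) for e in per_method[0]}


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return 0.0 if va == 0 or vb == 0 else cov / (va * vb) ** 0.5


def lagged_corr(x, y, lag):
    """Correlation between x[t] and y[t + lag] for a non-negative lag."""
    return pearson(x[:len(x) - lag], y[lag:])


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag with the largest absolute lagged correlation."""
    return max(range(max_lag + 1), key=lambda lag: abs(lagged_corr(x, y, lag)))


def permutation_p_value(x, y, lag, n_perm=999, seed=0):
    """Randomization test: shuffle x to break any x -> y dependence.

    Note: plain shuffling also destroys autocorrelation in x, which is a
    simplification for time series data.
    """
    rng = random.Random(seed)
    observed = abs(lagged_corr(x, y, lag))
    hits = 0
    for _ in range(n_perm):
        shuffled = list(x)
        rng.shuffle(shuffled)
        if abs(lagged_corr(shuffled, y, lag)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

A typical use would be to score every candidate gene-metabolite pair with several methods, pass the per-method score dictionaries to `ensemble_rank_average`, and then run `best_lag` and `permutation_p_value` on the top-ranked pairs. Rank averaging is chosen here only because it needs no calibration between methods whose scores live on different scales.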