RIKEN Plant Science Center, Yokohama, Japan.
Bioinformatics. 2011 Jul 1;27(13):i357-65. doi: 10.1093/bioinformatics/btr231.
Studying the interplay between gene expression and metabolite levels can yield important information on the physiology of stress responses and adaptation strategies. Performing transcriptomics and metabolomics in parallel during time-series experiments is a systematic way to gain such information. Several combined profiling datasets have been added to the public domain, and they form a valuable resource for hypothesis-generating studies. Unfortunately, detecting coresponses between transcript levels and metabolite abundances is non-trivial: coresponses cannot be assumed to overlap directly with the underlying biochemical pathways, and they may be subject to time delays and obscured by considerable noise.
Our aim was to predict pathway comemberships between metabolites and genes based on their coresponses to applied stress. We found that in the presence of strong noise and time-shifted responses, a hidden Markov model-based similarity outperforms the simpler Pearson correlation, but performs comparably or worse in their absence. Therefore, we propose a supervised method that uses pathway information to combine several similarity statistics into a consensus statistic that is more informative than any single measure. Using four combined profiling datasets, we show that comembership between metabolites and genes can be predicted for numerous KEGG pathways; this opens opportunities for the detection of transcriptionally regulated pathways and novel metabolically related genes.
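To illustrate why time-shifted responses are a problem for plain Pearson correlation, the following minimal sketch compares the unshifted correlation with the best correlation over a small range of lags. It is not the hidden Markov model-based similarity or the consensus statistic described above; the function names, the `max_lag` parameter and the toy profiles are assumptions made for illustration only.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return float((xc * yc).sum() / denom) if denom > 0 else 0.0

def best_lagged_pearson(transcript, metabolite, max_lag=2):
    """Return (lag, r) with the largest |r|, letting the metabolite
    trail the transcript by 0..max_lag time points (hypothetical helper)."""
    t = np.asarray(transcript, dtype=float)
    m = np.asarray(metabolite, dtype=float)
    best_lag, best_r = 0, pearson(t, m)
    for lag in range(1, max_lag + 1):
        if len(t) - lag < 3:  # too few overlapping points to correlate
            break
        r = pearson(t[:-lag], m[lag:])
        if abs(r) > abs(best_r):
            best_lag, best_r = lag, r
    return best_lag, best_r

# Toy data (invented): the metabolite repeats the transcript profile two time points later.
transcript = [0.0, 1.0, 3.0, 2.0, 1.0, 0.5, 0.0, 0.0]
metabolite = [0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 1.0, 0.5]

r_unshifted = pearson(transcript, metabolite)
lag, r_lagged = best_lagged_pearson(transcript, metabolite, max_lag=2)
print(f"unshifted r = {r_unshifted:.2f}; best lag = {lag}, r = {r_lagged:.2f}")
```

On these toy profiles the unshifted correlation is weak, while allowing a lag recovers the shared response shape. Scanning lags in this way is only one simple option; it also increases the chance of spurious matches in short, noisy series, which is one reason to compare several similarity measures rather than rely on one.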
A command-line software tool is available at http://www.cin.ufpe.br/~igcf/Metabolites.