Scott Chialvo Clare H, Che Ronglin, Reif David, Motsinger-Reif Alison, Reed Laura K
Department of Biological Sciences, University of Alabama, Box 870344, Tuscaloosa, AL 35487, USA.
Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA.
Metabolomics. 2016 Nov;12(11). doi: 10.1007/s11306-016-1117-3. Epub 2016 Sep 20.
'Multi-omics' datasets obtained from an organism of interest reared under different environmental treatments are increasingly common. Identifying links between metabolites and transcripts can clarify how the environment acts at different levels within the organism. However, many methods for characterizing physiological connections cannot address unidentified metabolites.
Here, we use Eigenvector Metabolite Analysis (EvMA) to examine links between metabolomic, transcriptomic, and phenotypic variation data and to assess the impact of environmental factors on these associations. Unlike other methods, EvMA can be used to analyze datasets that include unidentified metabolites and unannotated transcripts.
To demonstrate the utility of EvMA, we analyzed metabolomic, transcriptomic, and phenotypic datasets produced from 20 genotypes reared on four dietary treatments. We used a hierarchical distance-based method to cluster the metabolites. The links between metabolite clusters, gene expression, and overt phenotypes were characterized using the eigenmetabolite (first principal component) of each cluster.
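The workflow described above (cluster the metabolites, summarize each cluster by its first principal component, then correlate that summary with gene expression or phenotypes) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation. The abstract does not specify the distance measure or linkage, so the correlation-based distance, the average linkage, the Pearson test, and every function and variable name here are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from scipy.stats import pearsonr


def cluster_metabolites(X, n_clusters):
    """Cluster the columns (metabolites) of a samples x metabolites matrix.

    Assumption: distance = 1 - |Pearson r| with average linkage; the
    abstract only says "hierarchical distance-based method".
    """
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def eigenmetabolite(X_cluster):
    """Return the first principal component scores of one cluster."""
    # Standardize each metabolite so no single one dominates the component.
    Z = (X_cluster - X_cluster.mean(axis=0)) / X_cluster.std(axis=0, ddof=1)
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    pc1 = U[:, 0] * S[0]
    # The sign of a principal component is arbitrary; orient it so that it
    # tracks the average standardized abundance of the cluster.
    if np.corrcoef(pc1, Z.mean(axis=1))[0, 1] < 0:
        pc1 = -pc1
    return pc1


# Synthetic stand-in for 20 genotypes x 4 diets = 80 samples.
rng = np.random.default_rng(0)
n_samples = 80
factor_a = rng.normal(size=n_samples)  # hidden driver of metabolites 0-2
factor_b = rng.normal(size=n_samples)  # hidden driver of metabolites 3-5
X = np.column_stack(
    [factor_a + 0.3 * rng.normal(size=n_samples) for _ in range(3)]
    + [factor_b + 0.3 * rng.normal(size=n_samples) for _ in range(3)]
)
# A hypothetical transcript whose expression follows factor_a.
gene = factor_a + 0.5 * rng.normal(size=n_samples)

labels = cluster_metabolites(X, n_clusters=2)
em = eigenmetabolite(X[:, labels == labels[0]])
r, p_value = pearsonr(em, gene)
print(labels, round(r, 2), p_value < 0.05)
```

Because the eigenmetabolite is computed from abundance patterns alone, nothing in this sketch requires a metabolite to be identified or a transcript to be annotated. A real analysis would also need a multiple-testing correction across the many gene and phenotype comparisons.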
EvMA recovered chemically related groups of metabolites within the clusters. Using the eigenmetabolite, we identified genes and phenotypes that significantly correlated with each cluster. EvMA identified new connections between the phenotypes, metabolites, and gene transcripts. EvMA provides a simple method to identify correlations between metabolites, gene expression, and phenotypes. These correlations allow us to partition multivariate datasets into meaningful biological modules and to identify under-studied metabolites and unannotated gene transcripts that may be central to important biological processes. The results can inform our understanding of how environmental factors affect the mechanisms underlying physiological states of interest.