Lovell David, Pawlowsky-Glahn Vera, Egozcue Juan José, Marguerat Samuel, Bähler Jürg
Queensland University of Technology, Brisbane, Australia.
Dept. d'Informàtica, Matemàtica Aplicada i Estadística, U. de Girona, Spain.
PLoS Comput Biol. 2015 Mar 16;11(3):e1004075. doi: 10.1371/journal.pcbi.1004075. eCollection 2015 Mar.
In the life sciences, many measurement methods yield only the relative abundances of different components in a sample. With such relative (or compositional) data, differential expression needs careful interpretation, and correlation, a statistical workhorse for analyzing pairwise relationships, is an inappropriate measure of association. Using yeast gene expression data, we show how correlation can be misleading and present proportionality as a valid alternative for relative data. We show how the strength of proportionality between two variables can be meaningfully and interpretably described by a new statistic, ϕ, which can be used instead of correlation as the basis of familiar analyses and visualisation methods, including co-expression networks and clustered heatmaps. While the main aim of this study is to present proportionality as a means to analyse relative data, it also raises intriguing questions about the molecular mechanisms underlying the proportional regulation of a range of yeast genes.
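The idea of measuring proportionality rather than correlation can be illustrated with a minimal sketch. The abstract does not define ϕ, so the form used here, var(log(x/y)) / var(log x), is an assumption, and any transformation applied to the data beforehand is omitted. The abundances are synthetic and hypothetical, not the yeast data. The sketch shows that the variance of a log-ratio is unchanged when absolute abundances are rescaled to relative ones, and that it is zero for two exactly proportional components.

```python
import math
import random
import statistics


def log_ratio_variance(x, y):
    """Variance of log(x/y) across samples; zero when x and y are exactly proportional."""
    return statistics.variance([math.log(a / b) for a, b in zip(x, y)])


def phi(x, y):
    """Assumed form of the statistic: var(log(x/y)) / var(log x).

    Values near 0 suggest x and y are close to proportional.
    """
    return log_ratio_variance(x, y) / statistics.variance([math.log(a) for a in x])


def close(rows):
    """Rescale each sample (row) so that its components sum to 1."""
    return [[v / sum(row) for v in row] for row in rows]


rng = random.Random(0)

# Hypothetical absolute abundances: b is always twice a; c is independent of both.
absolute = []
for _ in range(50):
    a = rng.lognormvariate(0, 1)
    c = rng.lognormvariate(0, 1)
    absolute.append([a, 2 * a, c])

# Only relative abundances are observed.
relative = close(absolute)

a_abs, b_abs, c_abs = (list(col) for col in zip(*absolute))
a_rel, b_rel, c_rel = (list(col) for col in zip(*relative))

print("phi(a, b) on relative data:", phi(a_rel, b_rel))  # proportional pair
print("phi(a, c) on relative data:", phi(a_rel, c_rel))  # unrelated pair
print("var(log(a/c)) absolute vs relative:",
      log_ratio_variance(a_abs, c_abs), log_ratio_variance(a_rel, c_rel))
```

Because the per-sample scaling factor cancels inside each ratio, the log-ratio variance computed from relative data matches the one computed from the unobserved absolute data, which is what makes it usable where correlation is not.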