Department of Statistics and Department of Human Genetics, University of Chicago, Chicago, Illinois, USA.
PLoS One. 2013 Jul 5;8(7):e65245. doi: 10.1371/journal.pone.0065245. Print 2013.
We consider the problem of assessing associations between multiple related outcome variables and a single explanatory variable of interest. This problem arises in many settings, including genetic association studies, where the explanatory variable is genotype at a genetic variant. We outline a framework for conducting this type of analysis, based on Bayesian model comparison and model averaging for multivariate regressions. This framework unifies several common approaches to this problem, and includes both standard univariate and standard multivariate association tests as special cases. The framework also unifies the problems of testing for associations and explaining associations, that is, identifying which outcome variables are associated with genotype. This provides an alternative to the usual, but conceptually unsatisfying, approach of resorting to univariate tests when explaining and interpreting significant multivariate findings. The method is computationally tractable genome-wide for modest numbers of phenotypes (e.g. 5-10), and can be applied to summary data, without access to raw genotype and phenotype data. We illustrate the methods on simulated examples and on a genome-wide association study of blood lipid traits, where we identify 18 potential novel genetic associations that were not identified by univariate analyses of the same data.
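The model-averaging idea can be illustrated with a minimal sketch. It is generic Bayesian model averaging, not the paper's implementation. It assumes that a Bayes factor against the null has already been computed for each candidate model, where a model is the subset of outcomes taken to be associated with genotype. The function names, the uniform default prior, the outcome labels and the numbers in the usage note are all hypothetical.

```python
import math


def average_bayes_factors(log10_bf, prior=None):
    """Average per-model Bayes factors over a set of alternative models.

    log10_bf: dict mapping frozenset(outcome names) -> log10 Bayes factor of
              that model versus the null of no association (assumed given).
    prior:    dict with the same keys giving prior weights that sum to 1;
              a uniform prior is assumed when omitted.
    Returns (log10 of the averaged Bayes factor, posterior over models
    conditional on some association being present).
    """
    models = list(log10_bf)
    if prior is None:
        prior = {m: 1.0 / len(models) for m in models}
    # Weight of each model is prior * BF, kept in log10 space for stability.
    log_w = {m: math.log10(prior[m]) + log10_bf[m] for m in models}
    peak = max(log_w.values())
    total = sum(10 ** (v - peak) for v in log_w.values())
    log10_bf_avg = peak + math.log10(total)
    posterior = {m: 10 ** (log_w[m] - peak) / total for m in models}
    return log10_bf_avg, posterior


def inclusion_probabilities(posterior):
    """Posterior probability that each outcome belongs to the associated set."""
    outcomes = set().union(*posterior)
    return {o: sum(p for m, p in posterior.items() if o in m) for o in outcomes}
```

With hypothetical inputs such as `{frozenset({"HDL"}): 1.2, frozenset({"HDL", "TG"}): 3.5, frozenset({"LDL"}): 0.1}`, the first return value addresses the testing question (is the variant associated with any outcome?), while `inclusion_probabilities` addresses the explaining question (which outcomes?). This is the sense in which a single set of model weights serves both purposes.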