Max Planck Institutes Tübingen, Tübingen, Germany.
PLoS Comput Biol. 2010 May 6;6(5):e1000770. doi: 10.1371/journal.pcbi.1000770.
Gene expression measurements are influenced by a wide range of factors, such as the state of the cell, experimental conditions and variants in the sequence of regulatory regions. To understand the effect of a variable of interest, such as the genotype of a locus, it is important to account for variation that is due to confounding causes. Here, we present VBQTL, a probabilistic approach for mapping expression quantitative trait loci (eQTLs) that jointly models contributions from genotype as well as known and hidden confounding factors. VBQTL is implemented within an efficient and flexible inference framework, making it fast and tractable on large-scale problems. We compare the performance of VBQTL with alternative methods for dealing with confounding variability on eQTL mapping datasets from simulations, yeast, mouse, and human. Bayesian complexity control and joint modelling are shown to yield more precise estimates of the contribution of different confounding factors, which in turn reveals additional associations with measured transcript levels compared to alternative approaches. We present a threefold larger collection of cis eQTLs than previously found in a whole-genome eQTL scan of an outbred human population. Altogether, 27% of the tested probes show a significant genetic association in cis, and we validate that the additional eQTLs are likely to be real by replicating them in different sets of individuals. Our method is the next step in the analysis of high-dimensional phenotype data, and its application has revealed insights into genetic regulation of gene expression by demonstrating more abundant cis-acting eQTLs in human than previously shown. Our software is freely available online at http://www.sanger.ac.uk/resources/software/peer/.
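To illustrate why hidden confounders matter for eQTL mapping, the following is a minimal sketch on simulated data. It is not VBQTL: it swaps in plain principal component removal followed by a correlation test, a simpler two-step stand-in for the joint variational Bayesian model described above. All function names, effect sizes and dimensions are assumptions chosen for illustration only.

```python
import numpy as np


def simulate(n_samples=500, n_genes=50, n_hidden=3, seed=0):
    """Simulate one locus and an expression matrix (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    # Allele counts (0, 1 or 2) at a single locus.
    genotype = rng.integers(0, 3, size=n_samples).astype(float)
    # Hidden confounders (e.g. batch or cell state) that affect every gene.
    hidden = rng.normal(size=(n_samples, n_hidden))
    loadings = rng.normal(scale=3.0, size=(n_hidden, n_genes))
    expression = hidden @ loadings + rng.normal(size=(n_samples, n_genes))
    # Assumption: only gene 0 carries a true genetic effect from this locus.
    expression[:, 0] += 1.0 * genotype
    return genotype, expression


def remove_top_components(expression, n_components):
    """Subtract the top principal components from the centred expression matrix."""
    centered = expression - expression.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    low_rank = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
    return centered - low_rank


def abs_correlation(genotype, phenotype):
    """Absolute Pearson correlation between genotype and one gene's expression."""
    return abs(np.corrcoef(genotype, phenotype)[0, 1])


if __name__ == "__main__":
    genotype, expression = simulate()
    residual = remove_top_components(expression, n_components=3)
    print("raw |r|:      ", abs_correlation(genotype, expression[:, 0]))
    print("corrected |r|:", abs_correlation(genotype, residual[:, 0]))
```

In this simulation the association at gene 0 is largely masked by the hidden factors and becomes clearer once they are removed. Note that a two-step procedure like this one estimates the confounders without reference to genotype, so it can also absorb part of a genuine genetic signal; the sketch does not perform the joint modelling of genotype and confounders that the abstract describes.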