Lee Chaeyoung
Department of Bioinformatics and Life Science, Soongsil University, Seoul, South Korea.
Front Genet. 2019 Mar 22;10:199. doi: 10.3389/fgene.2019.00199. eCollection 2019.
The importance of expression quantitative trait loci (eQTLs) has been emphasized in understanding the genetic basis of cellular activities and complex phenotypes. Mixed models can identify eQTLs effectively by accounting for polygenic effects. In these models, the polygenic effects are treated as random variables, and their variability is described by the polygenic variance component. Within the frequentist mixed model framework, the polygenic and residual variance components are estimated first, and eQTL effects are then estimated conditional on those variance component estimates. A Bayesian approach to mixed model-based genome-wide eQTL analysis can also be used to estimate the parameters, and it offers various benefits. Bayesian inferences on unknown parameters are based on their marginal posterior distributions, and marginalizing the joint posterior distribution is a challenging task. This problem can be solved with Gibbs sampling, a Markov chain Monte Carlo algorithm that performs the integration numerically. This article reviews mixed model-based Bayesian eQTL analysis by Gibbs sampling. Theoretical and practical issues of Bayesian inference are discussed using a concise description of Bayesian modeling and the corresponding Gibbs sampling. The strengths of Bayesian inference are also discussed. The posterior probability distribution in Bayesian inference reflects uncertainty in the unknown parameters. This property is useful in eQTL analyses where the sample size is too small to apply the frequentist approach. Bayesian inference based on a posterior that reflects prior knowledge will be increasingly preferred as eQTL data accumulate. Extensive use of mixed model-based Bayesian eQTL analysis will accelerate the understanding of eQTLs that exhibit various regulatory functions.
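To make the Gibbs sampling idea concrete, the following is a minimal sketch for a single transcript and a single candidate SNP, assuming the model y = Xb + g + e with polygenic effects g ~ N(0, K s2g) and residuals e ~ N(0, I s2e). The function names, the flat prior on b, the scaled inverse chi-square priors on the variance components (`nu`, `scale`), and the simulated data are all assumptions made for illustration, not details taken from the article. Each iteration draws every unknown from its full conditional distribution, so the retained draws approximate the marginal posteriors without explicit integration.

```python
import numpy as np


def gibbs_mixed_model(y, X, K, n_iter=2000, burn_in=500, nu=4.0, scale=0.5, seed=0):
    """Gibbs sampler for y = X b + g + e, g ~ N(0, K*s2g), e ~ N(0, I*s2e).

    Assumed priors (illustrative only): flat on b, scaled inverse
    chi-square(nu, scale) on both variance components.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    d, U = np.linalg.eigh(K)              # K = U diag(d) U'
    d = np.clip(d, 1e-8, None)            # guard against tiny negative eigenvalues
    XtX_inv = np.linalg.inv(X.T @ X)
    L = np.linalg.cholesky(XtX_inv)       # L L' = (X'X)^-1
    b, g = np.zeros(p), np.zeros(n)
    s2g = s2e = 0.5 * np.var(y)           # crude starting values
    draws = {"b": [], "s2g": [], "s2e": []}
    for it in range(n_iter):
        # 1) fixed effects, including the candidate eQTL effect
        b_hat = XtX_inv @ X.T @ (y - g)
        b = b_hat + np.sqrt(s2e) * (L @ rng.standard_normal(p))
        # 2) polygenic effects, sampled in the eigenbasis of K
        r = U.T @ (y - X @ b)
        var = 1.0 / (1.0 / s2e + 1.0 / (d * s2g))
        g_rot = var * r / s2e + np.sqrt(var) * rng.standard_normal(n)
        g = U @ g_rot
        # 3) polygenic variance component (g' K^-1 g = sum(g_rot^2 / d))
        s2g = (np.sum(g_rot ** 2 / d) + nu * scale) / rng.chisquare(n + nu)
        # 4) residual variance component
        e = y - X @ b - g
        s2e = (e @ e + nu * scale) / rng.chisquare(n + nu)
        if it >= burn_in:
            draws["b"].append(b.copy())
            draws["s2g"].append(s2g)
            draws["s2e"].append(s2e)
    return {k: np.array(v) for k, v in draws.items()}


def simulate_eqtl_data(n=150, m=300, effect=1.0, seed=1):
    """Simulate toy data: m background markers define K, one extra SNP is tested."""
    rng = np.random.default_rng(seed)
    Z = rng.binomial(2, 0.5, size=(n, m)).astype(float)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    K = Z @ Z.T / m + 0.01 * np.eye(n)    # small ridge keeps K positive definite
    snp = rng.binomial(2, 0.5, size=n).astype(float)
    X = np.column_stack([np.ones(n), snp])
    g = np.sqrt(0.5) * (np.linalg.cholesky(K) @ rng.standard_normal(n))
    y = X @ np.array([2.0, effect]) + g + rng.normal(0.0, np.sqrt(0.5), n)
    return y, X, K


if __name__ == "__main__":
    y, X, K = simulate_eqtl_data()
    out = gibbs_mixed_model(y, X, K)
    snp_draws = out["b"][:, 1]
    print("posterior mean of SNP effect:", snp_draws.mean())
    print("P(effect > 0 | y):", (snp_draws > 0).mean())
```

Summaries such as the posterior mean or the posterior probability that the SNP effect is positive come directly from the retained draws, which is one way the posterior reflects uncertainty in the unknown parameters. A genome-wide analysis would repeat this for each transcript and SNP pair, and in practice convergence should be checked before the draws are trusted.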