Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany.
Heredity (Edinb). 2012 Oct;109(4):235-45. doi: 10.1038/hdy.2012.35. Epub 2012 Jul 18.
Accurate and fast estimation of the genetic parameters that underlie quantitative traits, using mixed linear models with additive and dominance effects, is of great importance in both natural and breeding populations. Here, we propose a new fast adaptive Markov chain Monte Carlo (MCMC) sampling algorithm for estimating genetic parameters in the linear mixed model with several random effects. In the learning phase of the algorithm, we use the hybrid Gibbs sampler to learn the covariance structure of the variance components. In the second phase, we use this covariance structure to formulate an effective proposal distribution for a Metropolis-Hastings algorithm, which uses a likelihood function in which the random effects have been integrated out. Compared with the hybrid Gibbs sampler, the new algorithm had better mixing properties and ran approximately twice as fast. The new algorithm was able to detect different modes in the posterior distribution. In addition, the posterior mode estimates from the adaptive MCMC method were close to the REML (residual maximum likelihood) estimates. Moreover, our exponential prior for the inverse variance components was vague and allowed the estimated posterior mode of a variance component to be practically zero, in agreement with the support from the likelihood (in the case of no dominance). The performance of the method is illustrated using simulated data sets with replicates and field data from barley.
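The two-phase scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several simplifications are assumptions made here for brevity: the learning phase uses a plain random-walk Metropolis sampler as a stand-in for the hybrid Gibbs sampler, fixed effects are omitted (the phenotype vector `y` is treated as having mean zero), the variance components are sampled on the log scale, and the second-phase proposal covariance is scaled by the common 2.38^2/d rule of thumb. The function names, the `rate` parameter of the exponential prior and the relationship matrices passed in `kernels` are all hypothetical.

```python
import numpy as np

def log_posterior(theta, y, kernels, rate=1e-3):
    """Log posterior of theta = log variance components (last entry: residual).

    Random effects are integrated out, so y ~ N(0, V) with
    V = sum_k s2_k * K_k + s2_e * I (assumption: no fixed effects).
    """
    s2 = np.exp(theta)
    n = y.size
    V = s2[-1] * np.eye(n)
    for s2_k, K in zip(s2[:-1], kernels):
        V = V + s2_k * K
    sign, logdet = np.linalg.slogdet(V)
    if sign <= 0:
        return -np.inf
    quad = y @ np.linalg.solve(V, y)
    loglik = -0.5 * (logdet + quad + n * np.log(2.0 * np.pi))
    # Exponential prior on each inverse variance 1/s2, including the
    # Jacobian of the change of variables to theta = log(s2).
    logprior = np.sum(-rate * np.exp(-theta) - theta)
    return loglik + logprior

def metropolis(theta0, logpost, prop_cov, n_iter, rng):
    """Random-walk Metropolis with a fixed Gaussian proposal covariance."""
    d = theta0.size
    chol = np.linalg.cholesky(prop_cov)
    chain = np.empty((n_iter, d))
    theta, lp = theta0.copy(), logpost(theta0)
    accepted = 0
    for i in range(n_iter):
        cand = theta + chol @ rng.standard_normal(d)
        lp_cand = logpost(cand)
        if np.log(rng.uniform()) < lp_cand - lp:
            theta, lp = cand, lp_cand
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_iter

def two_phase_sampler(y, kernels, n_learn=500, n_main=1000, seed=0):
    """Learn a proposal covariance in phase 1, then reuse it in phase 2."""
    rng = np.random.default_rng(seed)
    d = len(kernels) + 1
    logpost = lambda t: log_posterior(t, y, kernels)
    theta0 = np.full(d, np.log(np.var(y) / d))
    # Phase 1 (learning): stand-in sampler with an isotropic proposal.
    learn, _ = metropolis(theta0, logpost, 0.1 * np.eye(d), n_learn, rng)
    # Empirical covariance of the second half of the learning chain,
    # with a small jitter so the Cholesky factorisation always succeeds.
    cov = np.atleast_2d(np.cov(learn[n_learn // 2:].T)) + 1e-6 * np.eye(d)
    # Phase 2: Metropolis-Hastings with the learned covariance structure.
    main, acc = metropolis(learn[-1], logpost, (2.38 ** 2 / d) * cov, n_main, rng)
    return np.exp(main), acc  # samples of the variance components
```

Calling `two_phase_sampler(y, [K_add, K_dom])` with hypothetical additive and dominance relationship matrices would return posterior draws of the two genetic variances and the residual variance, together with the second-phase acceptance rate. Working on the marginal likelihood keeps the sampled parameter vector small, which is what makes a learned joint proposal practical.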