Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, 3083, Australia.
School of Applied Systems Biology, La Trobe University, Bundoora, VIC, 3083, Australia.
Commun Biol. 2022 Jul 5;5(1):661. doi: 10.1038/s42003-022-03624-1.
Bayesian methods such as BayesR, which predict the genetic value or risk of individuals from genotypes such as single nucleotide polymorphisms (SNPs), are often implemented with a Markov chain Monte Carlo (MCMC) process. However, generating the Markov chains is computationally slow. We introduce a form of blocked Gibbs sampling for estimating SNP effects that greatly reduces computation time: within each block, every SNP effect is sampled iteratively n times from its conditional block posterior, and iterating over all blocks m times then produces chains of length m × n. We apply this strategy to large-scale genomic prediction and fine-mapping problems using the Bayesian MCMC mixed-effects genetic model BayesR3. We validate the method on simulated data and then analyse empirical dairy cattle data, using high-dimensional milk mid-infrared spectra as an example of "omics" data, and show that this approach increases the precision of mapping variants that affect milk, fat, and protein yields relative to univariate analyses of the same three traits.
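The sampling scheme described above can be illustrated with a minimal sketch. This is not the BayesR3 implementation. It assumes a plain linear model y = Xb + e with a single normal prior on each SNP effect (in place of the BayesR mixture prior) and fixed, known variances. The function name `blocked_gibbs` and all parameter names are hypothetical. Each block's right-hand side is formed once from the full data, the n inner Gibbs sweeps then touch only the small within-block matrix, and m outer passes over all blocks give each SNP a chain of length m × n.

```python
import numpy as np

def blocked_gibbs(X, y, block_size, m, n, sigma_e2=1.0, sigma_b2=1.0, seed=0):
    """Illustrative blocked Gibbs sampler for y = X b + e (not BayesR3).

    Assumes e ~ N(0, sigma_e2) and b_j ~ N(0, sigma_b2), both variances fixed.
    Returns an array of shape (m * n, p) holding one chain per SNP effect.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    lam = sigma_e2 / sigma_b2                      # shrinkage from the normal prior
    b = np.zeros(p)
    resid = y - X @ b                              # residual given current effects
    blocks = [np.arange(s, min(s + block_size, p)) for s in range(0, p, block_size)]
    grams = [X[:, idx].T @ X[:, idx] for idx in blocks]   # within-block X'X, computed once
    chain = np.empty((m * n, p))

    for outer in range(m):                         # m passes over all blocks
        for idx, G in zip(blocks, grams):
            Xb = X[:, idx]
            b_old = b[idx].copy()
            b_blk = b_old.copy()
            # Block right-hand side: X_B'(y - X_{-B} b_{-B}), one pass over the data.
            rhs = Xb.T @ resid + G @ b_old
            for inner in range(n):                 # n sweeps using only the small matrix G
                for j in range(len(idx)):
                    c = G[j, j] + lam
                    # Remove the contribution of the other SNPs in the block.
                    adj = rhs[j] - G[j] @ b_blk + G[j, j] * b_blk[j]
                    b_blk[j] = rng.normal(adj / c, np.sqrt(sigma_e2 / c))
                chain[outer * n + inner, idx] = b_blk
            b[idx] = b_blk
            resid -= Xb @ (b_blk - b_old)          # update the residual once per block
    return chain

# Small simulated example (all values are illustrative).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))
true_b = np.zeros(20)
true_b[[2, 11]] = [0.8, -0.5]
y = X @ true_b + rng.standard_normal(200)

chain = blocked_gibbs(X, y, block_size=5, m=200, n=5)
post_mean = chain[100:].mean(axis=0)               # discard the first 100 samples as burn-in
```

In this sketch the expensive operations that involve all records (forming `rhs` and updating `resid`) happen once per block per outer pass, while the n inner sweeps cost only operations on a `block_size` × `block_size` matrix. Under the simplifying assumptions stated above, the posterior means should approach the ridge-regression solution, which offers a simple check on the sampler.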