Center for Quantitative Genetics and Genomics, Aarhus University, 8830, Tjele, Denmark.
Heredity (Edinb). 2020 Feb;124(2):274-287. doi: 10.1038/s41437-019-0273-4. Epub 2019 Oct 22.
Widely used genomic prediction models may not properly account for heterogeneous (co)variance structure across the genome. Models such as BayesA and BayesB assume locus-specific variances, which are strongly influenced by the prior for the (co)variance of single nucleotide polymorphism (SNP) effects, regardless of the amount of data. Models such as BayesC or GBLUP assume a common (co)variance for a proportion (BayesC) or all (GBLUP) of the SNP effects. In this study, we propose a multi-trait Bayesian whole-genome regression method (BayesN0), which groups SNPs into predefined regions to account for heterogeneous (co)variance structure across the genome. This model was also implemented in single-step Bayesian regression (ssBayesN0). For practical implementation, we considered multi-trait single-step SNPBLUP models, using (co)variance estimates from BayesN0 or ssBayesN0. Genotype data were simulated using haplotypes on the first five chromosomes of 2200 Danish Holstein cattle, and phenotypes were simulated for two traits with heritabilities of 0.1 or 0.4, assuming 200 quantitative trait loci (QTL). We compared prediction accuracies across prediction models and region sizes (one SNP, 100 SNPs, one chromosome, or the whole genome). In general, the highest accuracies were obtained when 100 adjacent SNPs were grouped together. For the 100-SNP region size, ssBayesN0 improved accuracies over BayesN0, and using (co)variance estimates from ssBayesN0 generally yielded higher accuracies than using (co)variance estimates from BayesN0. Our results suggest that, in routine genomic evaluations, it could be a good strategy to estimate (co)variance components with ssBayesN0 and then use those estimates for genomic prediction with multi-trait single-step SNPBLUP.
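To make the grouping of adjacent SNPs into regions concrete, the following minimal Python sketch assigns each SNP a region index. It is only an illustration and not the authors' implementation: the function name `assign_regions`, its parameters, and the choices that regions never span a chromosome boundary and that the last block on a chromosome may be shorter are all assumptions not stated in the abstract. SNPs are assumed to be sorted by chromosome and position.

```python
from itertools import groupby


def assign_regions(chromosomes, region_size=100):
    """Return one region index per SNP.

    `chromosomes` lists the chromosome label of each SNP, with SNPs
    assumed sorted by chromosome and position. Regions are blocks of up
    to `region_size` adjacent SNPs. Assumption for this sketch: a region
    never crosses a chromosome boundary, so the final block on each
    chromosome may hold fewer than `region_size` SNPs.
    """
    if region_size < 1:
        raise ValueError("region_size must be at least 1")
    regions = []
    next_region = 0  # first region index available for the current chromosome
    for _, snps in groupby(chromosomes):
        n_snps = len(list(snps))
        for i in range(n_snps):
            regions.append(next_region + i // region_size)
        # Advance by the number of blocks used on this chromosome (ceiling division).
        next_region += -(-n_snps // region_size)
    return regions


# Toy layout: 250 SNPs on chromosome 1 and 120 SNPs on chromosome 2.
toy_chromosomes = [1] * 250 + [2] * 120
toy_regions = assign_regions(toy_chromosomes, region_size=100)
```

In this sketch, `region_size=1` corresponds to one region per SNP, and a `region_size` at least as large as the longest chromosome corresponds to one region per chromosome. A model with region-specific (co)variances would then estimate one (co)variance matrix per distinct index.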