Ramstein Guillaume P, Casler Michael D
Department of Agronomy, University of Wisconsin-Madison, Madison, WI 53706
G3 (Bethesda). 2019 Mar 7;9(3):789-805. doi: 10.1534/g3.118.200969.
Genomic prediction is a useful tool to accelerate genetic gain in selection using DNA marker information. However, this technology typically relies on standard prediction procedures, such as genomic BLUP, that are not designed to accommodate population heterogeneity resulting from differences in marker effects across populations. In this study, we assayed different prediction procedures to capture marker-by-population interactions in genomic prediction models. Prediction procedures included genomic BLUP and two kernel-based extensions of genomic BLUP which explicitly accounted for population heterogeneity. To model population heterogeneity, dissemblance between populations was either depicted by a unique coefficient (as previously reported), or a more flexible function of genetic distance between populations (proposed herein). Models under investigation were applied in a diverse switchgrass sample under two validation schemes: whole-sample calibration, where all individuals except selection candidates are included in the calibration set, and cross-population calibration, where the target population is entirely excluded from the calibration set. First, we showed that using fixed effects, from principal components or putative population groups, appeared detrimental to prediction accuracy, especially in cross-population calibration. Then we showed that modeling population heterogeneity by our proposed procedure resulted in highly significant improvements in model fit. In such cases, gains in accuracy were often positive. These results suggest that population heterogeneity may be parsimoniously captured by kernel methods. However, in cases where improvement in model fit by our proposed procedure is null-to-moderate, ignoring heterogeneity should probably be preferred due to the robustness and simplicity of the standard genomic BLUP model.
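The two ways of modeling population heterogeneity described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the kernel formulas, so everything below is an assumption made for illustration. The sketch assumes a VanRaden-style genomic relationship matrix, a heterogeneity kernel formed as the element-wise product of that matrix with a population-similarity matrix, and an exponential decay `exp(-d / theta)` as the "flexible function of genetic distance". The function names, `rho`, `theta`, and the ridge parameter `lam` are hypothetical. Note that `exp(-d / theta)` is only guaranteed to give a positive semi-definite matrix for some distance measures (Euclidean distances, for example).

```python
import numpy as np


def genomic_relationship(M):
    """Relationship matrix from an n x p matrix of 0/1/2 allele counts
    (VanRaden-style scaling; an assumption, not taken from the source)."""
    p = M.mean(axis=0) / 2.0                 # allele frequency per marker
    W = M - 2.0 * p                          # center each marker column
    return W @ W.T / (2.0 * np.sum(p * (1.0 - p)))


def similarity_single_coefficient(n_pops, rho):
    """Population similarity with 1 on the diagonal and one shared
    coefficient rho for every pair of distinct populations."""
    return (1.0 - rho) * np.eye(n_pops) + rho * np.ones((n_pops, n_pops))


def similarity_from_distance(D, theta):
    """Population similarity that decays with genetic distance D (q x q).
    The exponential form is an illustrative choice."""
    return np.exp(-np.asarray(D, dtype=float) / theta)


def heterogeneity_kernel(G, pop, R):
    """Scale each pairwise relationship in G by the similarity between the
    populations of the two individuals (pop holds integer population labels)."""
    return G * R[np.ix_(pop, pop)]


def kernel_predict(K, y, train, test, lam=1.0):
    """Kernel ridge (BLUP-like) prediction of test individuals from training ones.
    lam stands in for the residual-to-genetic variance ratio."""
    mu = y[train].mean()
    A = K[np.ix_(train, train)] + lam * np.eye(len(train))
    return mu + K[np.ix_(test, train)] @ np.linalg.solve(A, y[train] - mu)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.integers(0, 3, size=(30, 200)).astype(float)   # simulated genotypes
    pop = np.repeat([0, 1, 2], 10)                          # three populations
    y = rng.normal(size=30)                                 # simulated phenotypes
    G = genomic_relationship(M)

    # Cross-population calibration: population 2 is excluded from training.
    train, test = np.where(pop != 2)[0], np.where(pop == 2)[0]
    D = np.array([[0.0, 0.5, 1.0], [0.5, 0.0, 0.5], [1.0, 0.5, 0.0]])
    K = heterogeneity_kernel(G, pop, similarity_from_distance(D, theta=1.0))
    print(kernel_predict(K, y, train, test))
```

In this sketch, setting `rho = 1` (or letting `theta` grow very large) makes every population similarity equal to 1, so the kernel reduces to the plain relationship matrix and the prediction reduces to a standard genomic BLUP-style predictor. That nesting is what allows the heterogeneity models to be compared against the standard model in terms of fit.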