Oakey Helena, Cullis Brian, Thompson Robin, Comadran Jordi, Halpin Claire, Waugh Robbie
Division of Plant Sciences, University of Dundee at the James Hutton Institute, Invergowrie, Dundee DD2 5DA, Scotland, UK.
National Institute for Applied Statistics Research Australia, University of Wollongong, NSW, 2522, Australia.
G3 (Bethesda). 2016 May 3;6(5):1313-26. doi: 10.1534/g3.116.027524.
Genomic selection in crop breeding introduces modeling challenges not found in animal studies. These include the need to accommodate replicate plants for each line, account for spatial variation in field trials, address line-by-environment interactions, and capture nonadditive effects. Here, we propose a flexible single-stage genomic selection approach that resolves these issues. Our linear mixed model incorporates spatial variation through environment-specific terms, together with randomization-based design terms. It models marker effects and marker-by-environment interactions using ridge regression best linear unbiased prediction, which extends genomic selection to multiple environments. Because the approach uses the raw data from line replicates, the line genetic variation is partitioned into marker and nonmarker residual genetic variation (i.e., additive and nonadditive effects). This results in a more precise estimate of marker genetic effects. Using barley height data from trials in two different years, with up to 477 cultivars, we demonstrate that our new genomic selection model improves predictions compared to current models. Analyzing single trials revealed improvements in predictive ability of up to 5.7%. For the multiple environment trial (MET) model, combining both year trials improved predictive ability by up to 11.4% compared to a single-environment analysis. Benefits were significant even when fewer markers were used. Compared to a single-year standard model run with 3490 markers, our partitioned MET model achieved the same predictive ability using between 500 and 1000 markers, depending on the trial. Our approach can be used to increase accuracy and confidence in the selection of the best lines for breeding and/or to reduce costs by using fewer markers.
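The partitioning of line genetic variation into marker and nonmarker residual components can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a single environment, omits the spatial and design terms, treats the variance components as known rather than estimating them, and solves the mixed model equations directly for the model y = mu + Z(Mu + g) + e. The function name `fit_partitioned_rrblup`, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np


def fit_partitioned_rrblup(y, line_idx, M, lam_u, lam_g):
    """Solve the mixed model equations for y = mu + Z(M u + g) + e.

    y        : plot-level phenotypes (one value per replicate plot)
    line_idx : index of the line grown on each plot
    M        : lines x markers matrix of marker scores
    lam_u    : ratio sigma_e^2 / sigma_u^2 (ridge penalty on marker effects)
    lam_g    : ratio sigma_e^2 / sigma_g^2 (penalty on nonmarker line effects)
    """
    n_obs = y.shape[0]
    n_lines, n_markers = M.shape

    # Z maps each replicate plot to its line.
    Z = np.zeros((n_obs, n_lines))
    Z[np.arange(n_obs), line_idx] = 1.0

    # Columns: overall mean, marker effects, nonmarker residual line effects.
    W = np.hstack([np.ones((n_obs, 1)), Z @ M, Z])

    # The mean is unpenalized; the random effects are shrunk by their ratios.
    penalty = np.concatenate(
        [[0.0], np.full(n_markers, lam_u), np.full(n_lines, lam_g)]
    )
    solution = np.linalg.solve(W.T @ W + np.diag(penalty), W.T @ y)

    mu = solution[0]
    u = solution[1:1 + n_markers]
    g = solution[1 + n_markers:]
    return mu, u, g


# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_lines, n_markers, n_reps = 60, 200, 2
M = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))  # inbred-style coding

sigma_u2, sigma_g2, sigma_e2 = 1.0 / n_markers, 0.25, 0.5
u_true = rng.normal(0.0, np.sqrt(sigma_u2), n_markers)
g_true = rng.normal(0.0, np.sqrt(sigma_g2), n_lines)

line_idx = np.repeat(np.arange(n_lines), n_reps)
noise = rng.normal(0.0, np.sqrt(sigma_e2), line_idx.size)
y = 100.0 + (M @ u_true + g_true)[line_idx] + noise

mu_hat, u_hat, g_hat = fit_partitioned_rrblup(
    y, line_idx, M, sigma_e2 / sigma_u2, sigma_e2 / sigma_g2
)

# Correlation between predicted and simulated marker-based genetic values.
accuracy = np.corrcoef(M @ u_hat, M @ u_true)[0, 1]
print(f"mean: {mu_hat:.2f}, marker-value correlation: {accuracy:.2f}")
```

Fitting the plot-level data rather than line means is what allows the nonmarker term `g` to be separated from plot error in this sketch. In practice the variance components would be estimated, for example by residual maximum likelihood, rather than supplied.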