The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, United Kingdom.
Cotton Product Design, Bayer CropScience, St Louis, USA.
Theor Appl Genet. 2022 Oct;135(10):3393-3415. doi: 10.1007/s00122-022-04186-w. Epub 2022 Sep 6.
The integration of known and latent environmental covariates within a single-stage genomic selection approach provides breeders with an informative and practical framework to utilise genotype by environment interaction for prediction into current and future environments. This paper develops a single-stage genomic selection approach which integrates known and latent environmental covariates within a special factor analytic framework. The factor analytic linear mixed model of Smith et al. (2001) is an effective method for analysing multi-environment trial (MET) datasets, but it has limited practicality: because the underlying factors are latent, the modelled genotype by environment interaction (GEI) is observable rather than predictable. The advantage of using random regressions on known environmental covariates, such as soil moisture and daily temperature, is that the modelled GEI becomes predictable. The integrated factor analytic linear mixed model (IFA-LMM) developed in this paper includes a model for predictable and observable GEI in terms of a joint set of known and latent environmental covariates. The IFA-LMM is demonstrated on a late-stage cotton breeding MET dataset from Bayer CropScience. The results show that the known covariates predominantly capture crossover GEI and explain 34.4% of the overall genetic variance. The most notable covariates are maximum downward solar radiation (10.1%), average cloud cover (4.5%) and maximum temperature (4.0%). The latent covariates predominantly capture non-crossover GEI and explain 40.5% of the overall genetic variance. The results also show that the average prediction accuracy of the IFA-LMM is [Formula: see text] higher than that of conventional random regression models for current environments and [Formula: see text] higher for future environments.
The IFA-LMM is therefore an effective method for analysing MET datasets which also utilises crossover and non-crossover GEI for genomic prediction into current and future environments. This is becoming increasingly important with the emergence of rapidly changing environments and climate change.
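The distinction between predictable and observable GEI can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the between-environment genetic variance matrix is additive, of the form G = S Σ S' + Λ Λ' + Ψ, where S holds known environmental covariates, Λ holds latent factor loadings and Ψ holds environment-specific variances. This form, the helper names (`variance_partition`, `predict_new_environment`) and every number below are hypothetical and are not taken from the paper or its dataset.

```python
# Hypothetical sketch: split genetic variance across environments into parts
# attributed to known covariates, latent factors, and environment-specific
# effects. Assumes G = S * Sigma * S' + Lambda * Lambda' + Psi (an assumption
# for illustration, not the paper's exact parameterisation).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

# Made-up inputs: 3 environments, 2 known covariates, 1 latent factor.
S = [[1.0, -0.5],          # known covariates per environment (centred)
     [0.0, 1.0],
     [-1.0, -0.5]]
SIGMA = [[0.30, 0.05],     # variance matrix of genotype slopes on covariates
         [0.05, 0.20]]
LAMBDA = [[0.6], [0.5], [0.7]]   # latent factor loadings per environment
PSI = [0.10, 0.15, 0.05]         # environment-specific genetic variances

def variance_partition(s, sigma, lam, psi):
    """Return the share of total genetic variance from each component."""
    known = trace(matmul(matmul(s, sigma), transpose(s)))
    latent = trace(matmul(lam, transpose(lam)))
    specific = sum(psi)
    total = known + latent + specific
    return {"known": known / total,
            "latent": latent / total,
            "specific": specific / total}

def predict_new_environment(slopes, s_new):
    """Predict genotype effects in an untested environment.

    Only the known covariates contribute: latent loadings are estimated
    from observed environments, so they are unavailable for a new one.
    slopes: one row of covariate slopes per genotype.
    s_new: covariate values for the new environment.
    """
    return [sum(b * s for b, s in zip(row, s_new)) for row in slopes]

if __name__ == "__main__":
    print(variance_partition(S, SIGMA, LAMBDA, PSI))
    print(predict_new_environment([[1.0, 2.0], [0.0, 1.0]], [0.5, -1.0]))
```

In this sketch the latent component contributes to the variance partition for the observed environments but plays no part in `predict_new_environment`, which mirrors the abstract's point that only GEI modelled through known covariates can be carried forward to future environments.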