Department of Biostatistics, University of Alabama at Birmingham, 1665 University Boulevard, 327L Ryals Public Health Building, Birmingham, AL, 35216, USA.
Theor Appl Genet. 2014 Mar;127(3):595-607. doi: 10.1007/s00122-013-2243-1. Epub 2013 Dec 12.
New methods that incorporate the main and interaction effects of high-dimensional markers and of high-dimensional environmental covariates increased the prediction accuracy of grain yield in wheat, both across and within environments. In most agricultural crops, the effects of genes on traits are modulated by environmental conditions, leading to genetic by environmental interaction (G × E). Modern genotyping technologies allow genomes to be characterized in great detail, and modern information systems can generate large volumes of environmental data. In principle, G × E can be accounted for using interactions between markers and environmental covariates (ECs). However, when genotypic and environmental information is high dimensional, modeling all possible interactions explicitly becomes infeasible. In this article we show how to model interactions between high-dimensional sets of markers and ECs using covariance functions. The model presented here consists of a (random) reaction norm in which the genetic and environmental gradients are described as linear functions of markers and of ECs, respectively. We assessed the proposed method using data from Arvalis, consisting of 139 wheat lines genotyped with 2,395 SNPs and evaluated for grain yield over 8 years at various locations within northern France. A total of 68 ECs, defined based on five phases of the phenology of the crop, were used in the analysis. Interaction terms accounted for a sizable proportion (16%) of the within-environment yield variance, and the prediction accuracy of models including interaction terms was substantially higher (17-34%) than that of models based on main effects only. Breeding for target environmental conditions has become a central priority of most breeding programs. Methods like the one presented here, which can capitalize on the wealth of genomic and environmental information available, will become increasingly important.
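The covariance-function idea in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that both gradients are linear in standardized covariates, so that each implies a linear kernel, and that the interaction covariance between two records is the element-wise (Hadamard) product of the marker-derived and EC-derived kernels. The function names, the scaling by the number of columns, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def standardize(M):
    """Center and scale each column; constant columns become all zeros."""
    M = np.asarray(M, dtype=float)
    sd = M.std(axis=0)
    return (M - M.mean(axis=0)) / np.where(sd > 0, sd, 1.0)


def linear_kernel(M):
    """Covariance implied by a random linear function of the columns of M."""
    Z = standardize(M)
    return Z @ Z.T / Z.shape[1]


def interaction_kernel(markers, ecs, line_idx, env_idx):
    """Record-level kernels for genetic, environmental and G x E terms.

    line_idx[n] and env_idx[n] give the line and environment of record n.
    """
    G = linear_kernel(markers)                   # lines x lines
    O = linear_kernel(ecs)                       # environments x environments
    G_obs = G[np.ix_(line_idx, line_idx)]        # expand to records
    O_obs = O[np.ix_(env_idx, env_idx)]
    return G_obs, O_obs, G_obs * O_obs           # Hadamard product


# Simulated toy data (assumption): 5 lines x 40 markers, 3 environments x 8 ECs.
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(5, 40))
ecs = rng.normal(size=(3, 8))
line_idx = np.repeat(np.arange(5), 3)            # every line in every environment
env_idx = np.tile(np.arange(3), 5)

G_obs, O_obs, K_int = interaction_kernel(markers, ecs, line_idx, env_idx)
```

Under these assumptions, `K_int` equals the kernel that would be obtained by forming every marker-by-EC product explicitly (40 × 8 = 320 columns here, and 2,395 × 68 = 162,860 with the dimensions reported above), but it is built from two small matrices and never requires those products to be stored.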