Avagyan Vahe, Boer Martin P, Solin Junita, van Dijk Aalt D J, Bustos-Korts Daniela, van Rossum Bart-Jan, Ramakers Jip J C, van Eeuwijk Fred, Kruijer Willem
Biometris, Wageningen University and Research, Wageningen, The Netherlands.
Bioinformatics, Wageningen University and Research, Wageningen, The Netherlands.
Theor Appl Genet. 2025 Mar 28;138(4):88. doi: 10.1007/s00122-025-04865-4.
Penalized factorial regression offers a computationally attractive alternative to kernel and deep learning methods for prediction of genotype by environment interactions. For two representative data sets on wheat and maize, prediction accuracies were comparable, while computing requirements and time were clearly lower.

A longstanding challenge in plant breeding and genetics is the prediction of yield for new environments in the presence of genotype by environment interaction (G×E). The genotypes in this case are promising candidate varieties at an advanced stage of breeding programs, or are part of statutory variety trials or post-registration trials. The genotypes have been tested in a limited set of trials, and the question is how they will perform in future growing conditions. A reaction norm approach seems adequate to address this challenge. Reaction norms are functions with genotype-specific parameters that express the phenotype as a function of environmental inputs. G×E follows from differences in genotype-specific slope or rate parameters. Prediction of yield for new environments requires the identification of suitable reaction norm functions and the estimation of genotype-specific parameters, together with knowledge about the environmental conditions. Here, we present penalized factorial regression with simple linear reaction norms for individual genotypes, whose slopes are regularized by imposing a penalty upon them. Different types of penalization provide shrinkage, automatic selection of environmental covariates (ECs) and protection against overfitting for prediction of yield with medium to large numbers of ECs. Illustrations of our approach are given for a maize and a wheat data set. For these data, our approach compares well to alternative methods based on Bayesian regression and deep learning with respect to prediction accuracy, while computational demands are clearly lower.
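The idea of genotype-specific linear reaction norms with penalized slopes can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified model in which each genotype gets its own intercept and its own slopes on centered ECs, fitted separately by elastic-net coordinate descent, and it ignores the environmental main effect and any structure shared across genotypes. All function names, the toy EC values and the toy yields are hypothetical.

```python
# Sketch only: genotype-specific linear reaction norms with penalized slopes.
# All names and data are hypothetical and are not taken from the paper.

def soft_threshold(z, t):
    """Shrink z toward zero by t (the lasso proximal step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def center_columns(X):
    """Center each EC across environments; return centered rows and the means."""
    n, p = len(X), len(X[0])
    means = [sum(row[k] for row in X) / n for k in range(p)]
    return [[row[k] - means[k] for k in range(p)] for row in X], means

def fit_reaction_norm(X, y, lam, alpha=1.0, n_sweeps=500):
    """Fit y_j = a + sum_k b_k * x_jk for ONE genotype by coordinate descent.

    Minimizes (1/2n) * RSS + lam * (alpha * sum|b_k| + (1 - alpha)/2 * sum b_k^2).
    alpha=1 is a lasso penalty (can set slopes exactly to zero, i.e. select ECs);
    alpha=0 is a ridge penalty (shrinkage only).
    """
    Xc, means = center_columns(X)
    n, p = len(Xc), len(Xc[0])
    a = sum(y) / n                      # unpenalized intercept (ECs are centered)
    b = [0.0] * p                       # penalized slopes, start at zero
    resid = [yj - a for yj in y]
    col_ms = [sum(Xc[j][k] ** 2 for j in range(n)) / n for k in range(p)]
    for _ in range(n_sweeps):
        for k in range(p):
            if col_ms[k] == 0.0:
                continue                # constant EC carries no information
            # Covariance of EC k with the partial residual (slope k removed)
            rho = sum(Xc[j][k] * resid[j] for j in range(n)) / n + col_ms[k] * b[k]
            new_b = soft_threshold(rho, lam * alpha) / (col_ms[k] + lam * (1.0 - alpha))
            delta = new_b - b[k]
            if delta != 0.0:
                for j in range(n):
                    resid[j] -= delta * Xc[j][k]
                b[k] = new_b
    return {"intercept": a, "slopes": b, "ec_means": means}

def predict(model, x_new):
    """Predict the phenotype of one genotype in a new environment with ECs x_new."""
    return model["intercept"] + sum(
        b * (x - m) for b, x, m in zip(model["slopes"], x_new, model["ec_means"])
    )

# Toy data: 6 tested environments, 2 ECs, 2 genotypes with different slopes.
ecs = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0], [6.0, 4.0]]
yields = {
    "G1": [5.0 + 2.0 * x1 for x1, x2 in ecs],             # responds to EC 1 only
    "G2": [8.0 + 0.5 * x1 + 1.0 * x2 for x1, x2 in ecs],  # responds to both ECs
}

# One reaction norm per genotype; lam would normally be chosen by cross-validation.
models = {g: fit_reaction_norm(ecs, y, lam=0.1) for g, y in yields.items()}
new_env = [7.0, 2.0]
predictions = {g: predict(m, new_env) for g, m in models.items()}
```

In this toy setting the two genotypes differ in their slopes, so their predicted ranking can change between environments, which is how G×E appears in a reaction norm model. Increasing `lam` shrinks the slopes toward zero, and with `alpha=1.0` a sufficiently large `lam` drops ECs altogether.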