Lian Lian, de Los Campos Gustavo
Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan 48824
Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan 48824; Department of Probability and Statistics, Michigan State University, East Lansing, Michigan 48824
G3 (Bethesda). 2015 Dec 29;6(3):589-97. doi: 10.1534/g3.115.026328.
The Finlay-Wilkinson regression (FW) is a popular method among plant breeders for describing genotype-by-environment interaction. The standard implementation is a two-step procedure that uses environment (sample) means as covariates in a within-line ordinary least squares (OLS) regression. This procedure can be suboptimal for at least four reasons: (1) in the first step, environmental means are typically estimated without considering genotype-by-environment interactions; (2) in the second step, uncertainty about the environmental means is ignored; (3) estimation treats lines and environments as fixed effects; and (4) the procedure does not incorporate genetic relationships (either pedigree-derived or marker-derived). Su et al. proposed to address these problems with a Bayesian method that estimates environmental and genotype parameters simultaneously and allows pedigree information to be incorporated. In this article we: (1) extend the model presented by Su et al. to allow integration of genomic information [e.g., single nucleotide polymorphisms (SNPs)] and covariance between environments, (2) present an R package (FW) that implements these methods, and (3) illustrate the use of the package with examples based on real data. The FW R package implements both the two-step OLS method and a full Bayesian approach for Finlay-Wilkinson regression through a simple interface. Using a real wheat data set, we demonstrate that the prediction accuracy of the Bayesian approach is consistently higher than that achieved by the two-step OLS method.
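The two-step OLS procedure described in the abstract can be illustrated with a minimal sketch. This is not the FW R package or its interface: the function name `fw_two_step`, the record layout, and the toy data are hypothetical. It also assumes that environment effects are centered on the mean of all observations, a convention the abstract does not specify.

```python
from collections import defaultdict
from statistics import mean


def fw_two_step(records):
    """Two-step Finlay-Wilkinson sketch (hypothetical helper, not the FW package).

    records: iterable of (line, environment, phenotype) tuples.
    Returns (h, fits): h maps environment -> estimated effect,
    fits maps line -> (intercept, slope) from the within-line OLS fit.
    """
    records = list(records)

    # Step 1: estimate each environment effect as its sample mean minus the
    # mean of all observations (assumed centering convention).
    by_env = defaultdict(list)
    for _, env, y in records:
        by_env[env].append(y)
    grand_mean = mean(y for _, _, y in records)
    h = {env: mean(ys) - grand_mean for env, ys in by_env.items()}

    # Step 2: for each line, regress phenotype on the estimated environment
    # effects by OLS, treating those estimates as if known without error.
    by_line = defaultdict(list)
    for line, env, y in records:
        by_line[line].append((h[env], y))

    fits = {}
    for line, pairs in by_line.items():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
        if sxx == 0:
            raise ValueError(f"line {line!r}: no variation in environment effects")
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        slope = sxy / sxx
        fits[line] = (y_bar - slope * x_bar, slope)
    return h, fits


# Toy, made-up data: two lines observed in three environments.
toy = [
    ("A", "E1", 3.5), ("A", "E2", 5.0), ("A", "E3", 6.5),
    ("B", "E1", 3.5), ("B", "E2", 4.0), ("B", "E3", 4.5),
]
h, fits = fw_two_step(toy)
```

Step 2 treats the estimated environment effects as fixed, known covariates. This is the limitation listed as reason (2) in the abstract, and the one that simultaneous Bayesian estimation is meant to address.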