Roso V M, Schenkel F S, Miller S P, Schaeffer L R
University of Guelph, Guelph, Ontario, N1G 2W1, Canada.
J Anim Sci. 2005 Aug;83(8):1788-800. doi: 10.2527/2005.8381788x.
Breed additive, dominance, and epistatic loss effects are of concern in the genetic evaluation of a multibreed population. Multiple regression equations used for fitting these effects may show a high degree of multicollinearity among predictor variables. Typically, when strong linear relationships exist, the regression coefficients have large SE and are sensitive to changes in the data file and to the addition or deletion of variables in the model. Generalized ridge regression methods were applied to obtain stable estimates of direct and maternal breed additive, dominance, and epistatic loss effects in the presence of multicollinearity among predictor variables. Preweaning weight gains of beef calves in Ontario, Canada, from 1986 to 1999 were analyzed. The genetic model included fixed direct and maternal breed additive, dominance, and epistatic loss effects, fixed environmental effects of age of the calf, contemporary group, and age of the dam × sex of the calf, random additive direct and maternal genetic effects, and random maternal permanent environment effect. The degree and the nature of the multicollinearity were identified and ridge regression methods were used as an alternative to ordinary least squares (LS). Ridge parameters were obtained using two different objective methods: 1) generalized ridge estimator of Hoerl and Kennard (R1); and 2) bootstrap in combination with cross-validation (R2). Both ridge regression methods outperformed the LS estimator with respect to mean squared error of predictions (MSEP) and variance inflation factors (VIF) computed over 100 bootstrap samples. The MSEP of R1 and R2 were similar, and they were 3% less than the MSEP of LS. The average VIF of LS, R1, and R2 were equal to 26.81, 6.10, and 4.18, respectively. Ridge regression methods were particularly effective in decreasing the multicollinearity involving predictor variables of breed additive effects.
Because of a high degree of confounding between estimates of maternal dominance and direct epistatic loss effects, it was not possible to compare the relative importance of these effects with a high level of confidence. The inclusion of epistatic loss effects in the additive-dominance model did not cause noticeable reranking of sires, dams, and calves based on across-breed EBV. More precise estimates of breed effects as a result of this study may result in more stable across-breed estimated breeding values over the years.
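To illustrate how a ridge penalty lowers variance inflation factors when predictors are nearly collinear, the following is a minimal sketch on simulated data. It is not the analysis reported in the abstract: it uses ordinary single-parameter ridge regression rather than the generalized Hoerl and Kennard estimator or the bootstrap with cross-validation procedure, and the function names (`standardize`, `ridge_vif`, `ridge_coefficients`), the simulated predictors, and the ridge parameter `k = 0.05` are all assumptions chosen for illustration. The ridge VIF is taken here as the diagonal of (R + kI)^-1 R (R + kI)^-1, where R is the predictor correlation matrix; with k = 0 this reduces to the usual least squares VIF.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit length, so Z'Z is the correlation matrix."""
    Xc = X - X.mean(axis=0)
    return Xc / np.sqrt((Xc ** 2).sum(axis=0))


def ridge_vif(X, k):
    """VIF of each predictor under ridge parameter k (k = 0 gives the LS VIF)."""
    Z = standardize(X)
    R = Z.T @ Z
    A = np.linalg.inv(R + k * np.eye(R.shape[0]))
    return np.diag(A @ R @ A)


def ridge_coefficients(X, y, k):
    """Ridge estimates on the standardized scale: (Z'Z + kI)^-1 Z'y."""
    Z = standardize(X)
    yc = y - y.mean()
    p = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + k * np.eye(p), Z.T @ yc)


# Simulated data (hypothetical): x2 is almost a copy of x1, x3 is independent.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + x3 + rng.normal(size=n)

k = 0.05  # illustrative ridge parameter, not a value from the study
vif_ls = ridge_vif(X, 0.0)
vif_ridge = ridge_vif(X, k)
beta_ls = ridge_coefficients(X, y, 0.0)
beta_ridge = ridge_coefficients(X, y, k)

print("VIF (LS):   ", np.round(vif_ls, 2))
print("VIF (ridge):", np.round(vif_ridge, 2))
print("beta (LS):   ", np.round(beta_ls, 3))
print("beta (ridge):", np.round(beta_ridge, 3))
```

In this sketch the two collinear predictors have large LS VIF values, and the penalty shrinks them at the cost of some bias in the coefficients. Choosing the ridge parameter objectively, as the abstract describes for R1 and R2, is a separate step that this example does not attempt.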