Genome-enabled methods for predicting litter size in pigs: a comparison.

Affiliation

1 Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA.

Publication information

Animal. 2013 Nov;7(11):1739-49. doi: 10.1017/S1751731113001389. Epub 2013 Jul 24.

Abstract

Predictive ability of models for litter size in swine on the basis of different sources of genetic information was investigated. Data represented average litter size on 2598, 1604 and 1897 60K genotyped sows from two purebred and one crossbred line, respectively. The average correlation (r) between observed and predicted phenotypes in a 10-fold cross-validation was used to assess predictive ability. Models were: pedigree-based mixed-effects model (PED), Bayesian ridge regression (BRR), Bayesian LASSO (BL), genomic BLUP (GBLUP), reproducing kernel Hilbert spaces regression (RKHS), Bayesian regularized neural networks (BRNN) and radial basis function neural networks (RBFNN). BRR and BL used the marker matrix or its principal component scores matrix (UD) as covariates; RKHS employed a Gaussian kernel with additive codes for markers whereas neural networks employed the additive genomic relationship matrix (G) or UD as inputs. The non-parametric models (RKHS, BRNN, RBFNN) gave similar predictions to the parametric counterparts (average r ranged from 0.15 to 0.23); most of the genome-based models outperformed PED (r = 0.16). Predictive abilities of linear models and RKHS were similar over lines, but BRNN varied markedly, giving the best prediction (r = 0.31) when G was used in crossbreds, but the worst (r = 0.02) when the G matrix was used in one of the purebred lines. The r values for RBFNN ranged from 0.16 to 0.23. Predictive ability was better in crossbreds (0.26) than in purebreds (0.15 to 0.22). This may be related to family structure in the purebred lines.

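The evaluation scheme described in the abstract (an additive genomic relationship matrix G, a GBLUP-type predictor, and the average correlation r between observed and predicted phenotypes over 10 folds) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are simulated, the function names (`vanraden_g`, `gblup_predict`, `cv_correlation`) are hypothetical, G is built with the commonly used VanRaden formula (the abstract does not say how G was computed), and the shrinkage parameter `lam` is fixed by hand rather than estimated from variance components as a real analysis would do.

```python
import numpy as np

def vanraden_g(M):
    """Additive genomic relationship matrix from 0/1/2 marker codes.
    Assumption: VanRaden-style centering and scaling by allele frequencies."""
    p = M.mean(axis=0) / 2.0                 # estimated allele frequencies
    Z = M - 2.0 * p                          # center each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(G, y, train, test, lam):
    """Predict test phenotypes as mean + G[test,train] (G[train,train] + lam*I)^-1 (y_train - mean)."""
    mu = y[train].mean()
    K = G[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(K, y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def cv_correlation(G, y, k=10, lam=1.0, seed=0):
    """Average correlation between observed and predicted phenotypes over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    rs = []
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = gblup_predict(G, y, train, test, lam)
        rs.append(np.corrcoef(y[test], pred)[0, 1])
    return float(np.mean(rs))

# Simulated toy data (hypothetical sizes; far smaller than the 60K panel in the study)
rng = np.random.default_rng(1)
n, m = 300, 500
freq = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freq, size=(n, m)).astype(float)   # marker codes 0/1/2
beta = rng.normal(0.0, 0.1, size=m)                    # simulated additive marker effects
y = M @ beta + rng.normal(0.0, 1.0, size=n)            # phenotype = genetic value + noise

G = vanraden_g(M)
r = cv_correlation(G, y, k=10, lam=1.0)
print(f"average 10-fold predictive correlation: {r:.3f}")
```

Swapping `G` for a Gaussian kernel computed from marker distances would give an RKHS-style predictor under the same cross-validation loop, which is one way to compare parametric and non-parametric models on equal footing.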