The CRC for Innovative Dairy Products, Australia.
Genet Sel Evol. 2009 Dec 31;41(1):56. doi: 10.1186/1297-9686-41-56.
Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle.
Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls.
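As an illustration of the evaluation described above, the following is a minimal sketch of one of the five methods (an RR-BLUP-style ridge regression of EBV on SNP genotypes) with k-fold cross-validation. It is not the authors' implementation. The simulated genotypes and EBV, the sample sizes, the shrinkage parameter `lam`, and all function names are assumptions made for the example. Accuracy is taken here as the correlation between EBV and MBV, and bias is assessed by the slope of the regression of EBV on MBV, where a slope near 1 suggests unbiased predictions.

```python
import numpy as np

def rr_blup_fit(X, y, lam):
    """Ridge-type marker effects: solve (X'X + lam*I) b = X'(y - mean(y))."""
    mu = y.mean()
    p = X.shape[1]
    b = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ (y - mu))
    return mu, b

def predict_mbv(X, mu, b):
    """MBV = overall mean + sum of marker effects weighted by genotype."""
    return mu + X @ b

def accuracy_and_bias(ebv, mbv):
    """Accuracy = corr(EBV, MBV); bias check = slope of EBV regressed on MBV."""
    acc = np.corrcoef(ebv, mbv)[0, 1]
    slope = np.cov(ebv, mbv)[0, 1] / np.var(mbv, ddof=1)
    return acc, slope

def cross_validate(X, y, lam, k=5, seed=0):
    """k-fold cross-validation: each animal is predicted from the other folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    mbv = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        mu, b = rr_blup_fit(X[train], y[train], lam)
        mbv[fold] = predict_mbv(X[fold], mu, b)
    return accuracy_and_bias(y, mbv)

# Simulated stand-in data (assumption): 400 animals, 200 SNP coded 0/1/2.
rng = np.random.default_rng(1)
n, p = 400, 200
X = rng.integers(0, 3, size=(n, p)).astype(float)
X -= X.mean(axis=0)                      # centre genotype columns
true_b = rng.normal(0.0, 0.1, p)         # hypothetical marker effects
y = X @ true_b + rng.normal(0.0, 1.0, n) # hypothetical EBV

# lam = residual variance / marker-effect variance under the simulation above
acc, slope = cross_validate(X, y, lam=100.0)
print(f"accuracy = {acc:.2f}, regression slope = {slope:.2f}")
```

The same `accuracy_and_bias` function could be applied to an independent set of younger animals, which is the second validation step the study describes.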
For both traits, FR-LS using a subset of SNP was significantly less accurate than the other methods, which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI; for PPT, only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree-based predictions gave accuracies 1.05 to 1.34 times higher than predictions based on pedigree alone. Computational requirements differed considerably among methods, with PLSR and RR-BLUP requiring the least computing time.
The four methods that use information from all SNP, namely RR-BLUP, Bayes-R, PLSR and SVR, generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended.