Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China.
J Anim Breed Genet. 2021 May;138(3):291-299. doi: 10.1111/jbg.12514. Epub 2020 Oct 22.
Genomic selection (GS), which uses whole-genome molecular markers to predict genomic estimated breeding values (GEBVs), is revolutionizing livestock and plant breeding. Seeking novel strategies with higher prediction accuracy for GS has been the ultimate goal of breeders. With the rapid development of artificial intelligence, machine learning algorithms are increasingly being applied to estimate GEBVs. Although some machine learning methods perform better in phenotype prediction, there is still considerable room for improvement. In this study, we applied an ensemble learning algorithm, Adaboost.RT, which integrates support vector regression (SVR), kernel ridge regression (KRR) and random forest (RF), to predict genomic breeding values for three economic traits (carcass weight, live weight and eye muscle area) in Chinese Simmental beef cattle. Predictive accuracy was measured as the Pearson correlation between the corrected phenotypes and the predicted GEBVs. We also compared the reliability of the SVR, KRR, RF, Adaboost.RT and GBLUP methods. The results showed that the machine learning methods outperformed GBLUP: the average improvement over GBLUP was 12.8%, 14.9%, 5.4% and 14.4% for SVR, KRR, RF and Adaboost.RT, respectively. Among the four machine learning methods, Adaboost.RT had reliability comparable to that of KRR, with higher stability. We therefore believe that the Adaboost.RT algorithm is a reliable and efficient method for GS.
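To make the ensemble idea concrete, the following is a minimal sketch of AdaBoost.RT in its commonly described form (threshold φ on the absolute relative error, β = ε^power, predictions combined with weights log(1/β)), scored with the Pearson correlation on held-out animals. It is not the authors' implementation. The abstract does not say how SVR, KRR and RF are combined inside the ensemble, so the sketch takes a generic `fit_base` hook and uses a hypothetical one-split stump as a stand-in weak learner. The synthetic genotypes, the values of `rounds`, `phi` and `power`, and all function names are assumptions made for illustration. The relative error also assumes that the target values are not close to zero.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation, the accuracy measure named in the abstract."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_stump(X, y):
    """Hypothetical weak learner: the best single-feature threshold split."""
    best = None
    for j in range(len(X[0])):
        # Dropping the largest value keeps the right-hand side non-empty.
        for thr in sorted({row[j] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    if best is None:  # every feature is constant: fall back to the mean
        mean = sum(y) / len(y)
        return lambda row: mean
    _, j, thr, ml, mr = best
    return lambda row: ml if row[j] <= thr else mr


def adaboost_rt(X, y, fit_base, rounds=20, phi=0.1, power=2, seed=0):
    """AdaBoost.RT sketch; fit_base(X, y) must return a callable predictor."""
    rng = random.Random(seed)
    n = len(y)
    weights = [1.0 / n] * n
    models, alphas = [], []
    for _ in range(rounds):
        # Resample the training set according to the current distribution.
        idx = rng.choices(range(n), weights=weights, k=n)
        model = fit_base([X[i] for i in idx], [y[i] for i in idx])
        # Absolute relative error on the full training set (assumes y != 0).
        are = [abs(model(X[i]) - y[i]) / abs(y[i]) for i in range(n)]
        # Error rate: total weight of samples whose error exceeds phi.
        eps = sum(w for w, e in zip(weights, are) if e > phi)
        eps = min(max(eps, 1e-10), 1 - 1e-10)  # keep log(1 / beta) finite and positive
        beta = eps ** power
        # Down-weight well-predicted samples, then renormalise.
        weights = [w * beta if e <= phi else w for w, e in zip(weights, are)]
        total = sum(weights)
        weights = [w / total for w in weights]
        models.append(model)
        alphas.append(math.log(1.0 / beta))
    norm = sum(alphas)
    return lambda row: sum(a * m(row) for a, m in zip(alphas, models)) / norm


# Synthetic stand-in data: genotypes coded 0/1/2, additive effects plus noise.
data_rng = random.Random(1)
n_animals, n_snps = 120, 15
X = [[data_rng.choice([0, 1, 2]) for _ in range(n_snps)] for _ in range(n_animals)]
effects = [data_rng.gauss(0, 1) for _ in range(n_snps)]
y = [50 + sum(e * g for e, g in zip(effects, row)) + data_rng.gauss(0, 1) for row in X]

train_X, test_X = X[:90], X[90:]
train_y, test_y = y[:90], y[90:]
predict = adaboost_rt(train_X, train_y, fit_stump, rounds=30, phi=0.05)
predictions = [predict(row) for row in test_X]
accuracy = pearson(test_y, predictions)
print(f"held-out Pearson correlation: {accuracy:.3f}")
```

In practice, `fit_base` could wrap any regressor with fit and predict steps, for example SVR, KRR or RF models from a machine learning library. How the three are scheduled or weighted within the ensemble would have to follow the full paper, not this sketch.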