Liu Bogong, Liu Huichao, Tu Junhao, Xiao Jian, Yang Jie, He Xi, Zhang Haihan
College of Animal Science and Technology, Hunan Agricultural University, Changsha, Hunan, China.
Hunan Xiangjia Husbandry Co., Ltd, Changde, Hunan, China.
Poult Sci. 2025 Jan;104(1):104489. doi: 10.1016/j.psj.2024.104489. Epub 2024 Nov 1.
Machine learning (ML) methods have developed rapidly in many theoretical and practical research areas, including the prediction of genomic breeding values for large livestock animals. However, few studies have investigated the application of ML in broiler breeding. In this study, seven ML methods were employed to predict the genomic breeding values of laying, growth, and carcass traits in a yellow-feathered broiler breeding population: support vector regression (SVR), random forest (RF), gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), kernel ridge regression (KRR), and multilayer perceptron (MLP). The results indicated that classic methods, such as GBLUP and Bayesian methods, achieved higher prediction accuracy than ML methods for five of the eight traits. For half-eviscerated weight (HEW), ML methods showed an average improvement of 54.4% over GBLUP and Bayesian methods. Among the ML methods, SVR, RF, GBDT, and XGBoost exhibited improvements exceeding 60% (61.3%, 61.0%, 60.4%, and 60.7%, respectively), MLP improved by 54.4%, LightGBM by 53.7%, and KRR had the lowest improvement at 29.4%. For eviscerated weight (EW), ML methods also outperformed GBLUP and Bayesian methods. MLP showed the largest improvement at 19.0%, while SVR, RF, GBDT, XGBoost, LightGBM, and KRR improved by 15.0%, 16.5%, 9.5%, 7.0%, 1.6%, and 15.9%, respectively. Compared with default hyperparameters, the average improvement of ML methods with tuned hyperparameters across the eight traits was 34.0%, 32.9%, 27.0%, 19.3%, 26.8%, 13.2%, 18.9%, and 46.3%, respectively. The prediction accuracy of the above algorithms could be further improved by using a genome-wide association study (GWAS) to select subsets of significant SNPs. This work provides valuable insights into genomic prediction, aiding genetic breeding in broilers.
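As a quick consistency check on the reported figures, the short Python sketch below recomputes the average HEW improvement from the seven per-method values given in the abstract. The `relative_improvement` helper is an assumption about how a percentage improvement might be defined (gain in accuracy relative to the baseline accuracy); the abstract does not state the exact formula, and the dictionary and function names are hypothetical.

```python
from statistics import mean

# Per-method improvements over GBLUP and Bayesian methods, in percent,
# copied from the abstract.
hew_improvement = {
    "SVR": 61.3, "RF": 61.0, "GBDT": 60.4, "XGBoost": 60.7,
    "MLP": 54.4, "LightGBM": 53.7, "KRR": 29.4,
}
ew_improvement = {
    "MLP": 19.0, "SVR": 15.0, "RF": 16.5, "GBDT": 9.5,
    "XGBoost": 7.0, "LightGBM": 1.6, "KRR": 15.9,
}


def relative_improvement(ml_accuracy, baseline_accuracy):
    """Percent gain of an ML method over a baseline.

    Assumed definition, not taken from the abstract:
    100 * (ML accuracy - baseline accuracy) / baseline accuracy.
    """
    return 100.0 * (ml_accuracy - baseline_accuracy) / baseline_accuracy


# Unweighted mean across the seven ML methods for each trait.
hew_mean = mean(hew_improvement.values())
ew_mean = mean(ew_improvement.values())

# Method with the largest reported gain for each trait.
best_hew = max(hew_improvement, key=hew_improvement.get)
best_ew = max(ew_improvement, key=ew_improvement.get)

print(f"HEW: mean improvement {hew_mean:.1f}%, best method {best_hew}")
print(f"EW:  mean improvement {ew_mean:.1f}%, best method {best_ew}")
```

The unweighted HEW mean rounds to 54.4%, which matches the average stated in the abstract. The same calculation for EW gives roughly 12.1%, a figure the abstract itself does not report.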