Hong Joon-Ki, Kim Yong-Min, Cho Eun-Seok, Lee Jae-Bong, Kim Young-Sin, Park Hee-Bok
Swine Division, National Institute of Animal Science, Rural Development Administration, Cheonan 31000, Korea.
Korea Zoonosis Research Institute, Jeonbuk National University, Iksan 54531, Korea.
Anim Biosci. 2024 Apr;37(4):622-630. doi: 10.5713/ab.23.0264. Epub 2024 Jan 14.
Phenotypic information on sow lifetime productivity (SLP) is not yet available to pig breeders at the time of selection, so they would benefit from genetic information about candidate sows. Genomic data interpreted using deep learning (DL) techniques could contribute to the genetic improvement of SLP and thereby to farm profitability, because DL models can capture nonlinear genetic effects such as dominance and epistasis more efficiently than conventional genomic prediction methods based on linear models. This study aimed to investigate the usefulness of DL for the genomic prediction of two SLP-related traits: lifetime number of litters (LNL) and lifetime pig production (LPP).
Two bivariate DL models, convolutional neural network (CNN) and local convolutional neural network (LCNN), were compared with conventional bivariate linear models (i.e., genomic best linear unbiased prediction, Bayesian ridge regression, Bayes A, and Bayes B). Phenotype and pedigree data were collected from 40,011 sows that had husbandry records. Among these, 3,652 pigs were genotyped using the PorcineSNP60K BeadChip.
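The difference between the two DL models can be illustrated with a minimal sketch. The abstract does not describe the architectures, so this assumes that "local" means a locally connected layer, in which each window of SNPs has its own filter rather than sharing one filter across the genome. The function names, genotype coding (0/1/2), kernel values, and stride are all hypothetical and chosen only for illustration.

```python
def conv1d_shared(genotypes, kernel, stride):
    """Standard 1-D convolution: ONE kernel is reused for every SNP window."""
    k = len(kernel)
    starts = range(0, len(genotypes) - k + 1, stride)
    return [sum(w * g for w, g in zip(kernel, genotypes[s:s + k]))
            for s in starts]


def conv1d_local(genotypes, kernels, stride):
    """Locally connected variant: each SNP window has its OWN kernel."""
    k = len(kernels[0])
    starts = range(0, len(genotypes) - k + 1, stride)
    if len(kernels) != len(starts):
        raise ValueError("need exactly one kernel per window")
    return [sum(w * g for w, g in zip(kern, genotypes[s:s + k]))
            for kern, s in zip(kernels, starts)]


# Hypothetical genotypes for six SNPs, coded as allele counts (0, 1, 2).
snps = [0, 1, 2, 1, 0, 2]

# Shared weights: the same two-SNP filter is applied to all three windows.
shared = conv1d_shared(snps, [1.0, -1.0], stride=2)

# Local weights: three windows, three different (hypothetical) filters.
local = conv1d_local(snps, [[1.0, -1.0], [0.5, 0.5], [0.0, 1.0]], stride=2)

print(shared)  # [-1.0, 1.0, -2.0]
print(local)   # [-1.0, 1.5, 2.0]
```

Under this assumption, the shared-weight layer treats every genomic region alike, while the local layer can learn region-specific effects at the cost of more parameters. A real model would stack such layers with nonlinear activations and train the weights, which this sketch omits.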
The best predictive correlation for LNL was obtained with CNN (0.28), followed by LCNN (0.26) and conventional linear models (approximately 0.21). For LPP, the best predictive correlation was also obtained with CNN (0.29), followed by LCNN (0.27) and conventional linear models (approximately 0.25). A similar trend was observed with the mean squared error of prediction for the SLP traits.
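The two evaluation metrics reported here can be sketched as follows, assuming that predictive correlation means the Pearson correlation between observed phenotypes and predicted values in a validation set. The abstract does not state the exact validation scheme, and the numbers below are hypothetical and not taken from the study.

```python
import math


def predictive_correlation(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty, equal-length sequences")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical validation data (illustration only).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]

r = predictive_correlation(observed, predicted)
mse = mean_squared_error(observed, predicted)
print(round(r, 4), mse)  # 0.8944 1.0
```

A higher correlation and a lower mean squared error both indicate better prediction, which is why the two metrics are expected to rank the models in a similar order.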
This study provides an example of a CNN that can outperform linear model-based genomic prediction approaches when nonlinear interaction components are important; here, LNL and LPP exhibited strong epistatic interaction components. Additionally, our results suggest that bivariate DL models could further improve prediction accuracy by utilizing the genetic correlation between LNL and LPP.