Predicting milk protein fractions using infrared spectroscopy and a gradient boosting machine for breeding purposes in Holstein cattle.

Author information

Macedo Mota L F, Bisutti V, Vanzin A, Pegolo S, Toscano A, Schiavon S, Tagliapietra F, Gallo L, Ajmone Marsan P, Cecchinato A

Affiliations

Department of Agronomy, Food, Natural Resources, Animals and Environment (DAFNAE), University of Padova, Viale dell'Università 16, 35020 Legnaro, Italy.

Publication information

J Dairy Sci. 2023 Mar;106(3):1853-1873. doi: 10.3168/jds.2022-22119. Epub 2023 Jan 27.

Abstract

In recent years, increasing attention has been focused on the genetic evaluation of protein fractions in cow milk with the aim of improving milk quality and technological characteristics. In this context, advances in high-throughput phenotyping by Fourier transform infrared (FTIR) spectroscopy offer the opportunity for large-scale, efficient measurement of novel traits that can be exploited in breeding programs as indicator traits. We took milk samples from 2,558 Holstein cows belonging to 38 herds in northern Italy, operating under different production systems. Fourier transform infrared spectra were collected on the same day as milk sampling and stored for subsequent analysis. Two sets of data (i.e., phenotypes and FTIR spectra) collected in 2 different years (2013 and 2019-2020) were compiled. The following traits were assessed using HPLC: true protein, major casein fractions [αS1-casein (CN), αS2-CN, β-CN, κ-CN, and glycosylated-κ-CN], and major whey proteins (β-lactoglobulin and α-lactalbumin), all of which were expressed both in grams per liter (g/L) and as a proportion of total nitrogen (% N). The FTIR predictions were calculated using the gradient boosting machine technique and tested by 3 different cross-validation (CRV) methods.
We used the following CRV scenarios: (1) random 10-fold CRV, which randomly split the whole data set into 10 folds of equal size (9 folds for training and 1 fold for validation); (2) herd/date-out CRV, which assigned 80% of the herd/date groups to the training set and the remaining 20% to an independent validation set; (3) forward/backward CRV, which split the data set into training and validation sets according to the year of milk sampling (FTIR and gold standard data assessed in 2013 or 2019-2020), using the "old" database for training and the "new" database for validation, and vice versa, with independence between them; and (4) CRV for genetic parameters (CRV-gen), in which animals without pedigree information were assigned to a fixed training population and animals with pedigree information were split into 5 folds, of which 1 fold was added to the fixed training population and 4 folds were assigned to the validation set (independent from the training set). The results (i.e., measures and predictions) of CRV-gen were used to infer the genetic parameters for the gold standard laboratory measurements (i.e., proteins assessed with HPLC) and the FTIR-based predictions from a bi-trait animal model using single-step genomic BLUP. We found that the prediction accuracies of the gradient boosting machine equations differed according to the way in which the proteins were expressed, with higher accuracy when expressed in g/L than when expressed as % N in all CRV scenarios. Concerning the reproducibility of the equations over the different years, the results showed no relevant differences in predictive ability between using "old" data as the training set and "new" data as the validation set and vice versa. Comparing the additive genetic variance estimates for milk protein fractions between the FTIR predictions and the HPLC measurements, we found reductions of 19.7% for milk protein fractions expressed in g/L and 21.19% for those expressed as % N.
Although we found reductions in the heritability estimates, they were small, ranging from 1.9 to 7.25% for g/L and from 1.6 to 7.9% for % N. The posterior distributions of the additive genetic correlations (r) between the FTIR predictions and the laboratory measurements were generally high (>0.8), even when the milk protein fractions were expressed as % N. Our results show the potential of using FTIR predictions in breeding programs as indicator traits for the selection of animals to enhance milk protein fraction contents. We expect acceptable responses to selection due to the high genetic correlations between HPLC measurements and FTIR predictions.
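
The first three cross-validation scenarios described in the abstract could be set up roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function names, the toy herd labels, and the period labels are all hypothetical, and the gradient boosting model fit itself is omitted. It only illustrates how sample indices might be partitioned so that the validation set stays independent of the training set in each scenario.

```python
import random


def random_kfold(n_samples, k=10, seed=0):
    """Scenario 1: shuffle sample indices and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, val


def group_out_split(groups, train_fraction=0.8, seed=0):
    """Scenario 2: hold out whole herd/date groups so no group spans both sets."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    n_train = round(train_fraction * len(unique))
    train_groups = set(unique[:n_train])
    train = [i for i, g in enumerate(groups) if g in train_groups]
    val = [i for i, g in enumerate(groups) if g not in train_groups]
    return train, val


def forward_backward_splits(periods, old="2013", new="2019-2020"):
    """Scenario 3: train on one sampling period and validate on the other, both ways."""
    old_idx = [i for i, p in enumerate(periods) if p == old]
    new_idx = [i for i, p in enumerate(periods) if p == new]
    return {"forward": (old_idx, new_idx), "backward": (new_idx, old_idx)}


# Toy data (hypothetical): 200 samples, 20 herd/date groups, two sampling periods.
groups = [f"herd{i % 20}" for i in range(200)]
periods = ["2013" if i < 80 else "2019-2020" for i in range(200)]

folds = list(random_kfold(200, k=10))
gtrain, gval = group_out_split(groups)
fb = forward_backward_splits(periods)
```

Splitting by whole groups or by sampling period, rather than by individual sample, is what keeps records from the same herd/date or the same year from appearing on both sides of the split. That is why scenarios 2 and 3 are a stricter test of predictive ability than the random 10-fold split.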
