Mikshowsky Ashley A, Gianola Daniel, Weigel Kent A
Department of Dairy Science, University of Wisconsin, Madison 53706.
Department of Dairy Science, University of Wisconsin, Madison 53706; Department of Animal Sciences, University of Wisconsin, Madison 53706; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison 53706.
J Dairy Sci. 2017 Jan;100(1):453-464. doi: 10.3168/jds.2016-11496. Epub 2016 Nov 23.
Since the introduction of genome-enabled prediction for dairy cattle in 2009, genomic selection has markedly changed many aspects of the dairy genetics industry and enhanced the rate of response to selection for most economically important traits. Young dairy bulls are genotyped to obtain their genomic predicted transmitting ability (GPTA) and reliability (REL) values. These GPTA are a main factor in most purchasing, marketing, and culling decisions until bulls reach 5 yr of age and their milk-recorded offspring become available. At that time, daughter yield deviations (DYD) can be compared with the GPTA computed several years earlier. For most bulls, the DYD align well with the initial predictions. However, for some bulls, the difference between DYD and corresponding GPTA is quite large, and published REL are of limited value in identifying such bulls. A method of bootstrap aggregation sampling (bagging) using genomic BLUP (GBLUP) was applied to predict the GPTA of 2,963, 2,963, and 2,803 young Holstein bulls for protein yield, somatic cell score, and daughter pregnancy rate (DPR), respectively. For each trait, 50 bootstrap samples were drawn from a reference population comprising the DYD from 2011 of 8,610, 8,405, and 7,945 older Holstein bulls, respectively. Leave-one-out cross validation was also performed to assess prediction accuracy when removing specific bulls from the reference population. The main objectives of this study were (1) to assess the extent to which current REL values and alternative measures of variability, such as the bootstrap standard deviation (SD) of predictions, could detect bulls whose daughter performance deviates significantly from early genomic predictions, and (2) to identify factors associated with the reference population that are informative about inaccurate genomic predictions. The SD of bootstrap predictions was a mildly useful metric for identifying bulls whose future daughter performance may deviate significantly from early GPTA for protein and DPR.
Leave-one-out cross validation allowed us to identify groups of reference population bulls that were influential on other reference population bulls for protein yield and observe their effects on predictions of testing set bulls, as a whole and individually.
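The bagging procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the VanRaden-style relationship matrix, the single ridge parameter `lam` (standing in for the ratio of residual to genetic variance), and the simulated genotypes are all assumptions made for illustration, and the weighting of DYD by their reliabilities is ignored. The sketch resamples the reference bulls with replacement, fits GBLUP on each bootstrap sample, and reports the mean and SD of the resulting predictions for each young bull.

```python
import numpy as np


def genomic_relationship(markers):
    """Relationship matrix from an (animals x markers) array of 0/1/2 allele counts.

    Assumes a VanRaden-style scaling; the abstract does not state which form was used.
    """
    freq = markers.mean(axis=0) / 2.0
    centered = markers - 2.0 * freq
    scale = 2.0 * np.sum(freq * (1.0 - freq))
    return centered @ centered.T / scale


def gblup_predict(G, ref_idx, test_idx, y_ref, lam):
    """Predict test animals from reference records; lam is a hypothetical variance ratio."""
    G_rr = G[np.ix_(ref_idx, ref_idx)]
    G_tr = G[np.ix_(test_idx, ref_idx)]
    mu = y_ref.mean()
    # Solve (G_rr + lam * I) alpha = y - mu, then project onto the test animals.
    alpha = np.linalg.solve(G_rr + lam * np.eye(len(ref_idx)), y_ref - mu)
    return mu + G_tr @ alpha


def bagged_gblup(G, ref_idx, test_idx, y_ref, lam=1.0, n_boot=50, seed=0):
    """Return the bagged prediction and the bootstrap SD for each test animal."""
    rng = np.random.default_rng(seed)
    n_ref = len(ref_idx)
    preds = np.empty((n_boot, len(test_idx)))
    for b in range(n_boot):
        # Resample reference bulls with replacement, keeping each record with its bull.
        draw = rng.integers(0, n_ref, size=n_ref)
        preds[b] = gblup_predict(G, ref_idx[draw], test_idx, y_ref[draw], lam)
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)


if __name__ == "__main__":
    # Simulated data only: 300 reference bulls, 50 young bulls, 500 markers.
    rng = np.random.default_rng(42)
    markers = rng.integers(0, 3, size=(350, 500)).astype(float)
    effects = rng.normal(scale=0.1, size=500)
    records = markers @ effects + rng.normal(size=350)
    G = genomic_relationship(markers)
    ref_idx, test_idx = np.arange(300), np.arange(300, 350)
    gpta, boot_sd = bagged_gblup(G, ref_idx, test_idx, records[ref_idx])
    # Young bulls with the largest bootstrap SD are the candidates for closer scrutiny.
    print(test_idx[np.argsort(boot_sd)[::-1][:5]])
```

In this sketch a large bootstrap SD means the prediction for that bull changes noticeably depending on which reference bulls happen to be sampled, which is the sense in which the SD serves as an alternative measure of variability to the published REL.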