Ertl J, Edel C, Emmerling R, Pausch H, Fries R, Götz K-U
Institute of Animal Breeding, Bavarian State Research Centre for Agriculture, 85586 Poing, Germany.
J Dairy Sci. 2014;97(1):487-96. doi: 10.3168/jds.2013-6855. Epub 2013 Nov 7.
This study investigated the reliability of genomic predictions using medium-density (50K; 40,089 markers) or high-density (HD; 388,951 markers) marker sets. We developed an approximate method to test differences in validation reliability for significance. Model-based reliability and the effect of HD genotypes on inflation of predictions were also analyzed. Genomic breeding values were predicted for at least 1,321 validation bulls from phenotypes and genotypes of at least 5,324 calibration bulls by means of a linear model for milk, fat, and protein yield; somatic cell score; milkability; muscling; udder, feet, and legs score; and stature. In total, 1,485 bulls were actually HD genotyped; HD genotypes of the other animals were imputed from 50K genotypes using FImpute software. Validation reliability was measured as the coefficient of determination of the weighted regression of daughter yield deviations on predicted breeding values, divided by the reliability of daughter yield deviations; inflation was evaluated by the slope of this regression. Model-based reliability was calculated from the model. To test differences in validation reliability statistically, distributions for validation reliability of 50K markers were derived by repeatedly drawing 50,000-marker samples from the HD set. Additionally, the benefit of HD genotypes in validation reliability was tested by repeatedly sampling validation groups and calculating the difference in validation reliability between HD and 50K genotypes for each sampled group of bulls. The mean benefit in validation reliability of HD genotypes was 0.015 compared with real 50K genotypes and 0.028 compared with 50K samples from HD affected by imputation error, and was significant for all traits. With HD genotypes, the model-based reliability was, on average, 0.036 lower and the regression coefficient was 0.036 closer to the expected value.
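The validation statistics described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `validation_stats`, the use of a single scalar for the reliability of daughter yield deviations (for example a mean across validation bulls), and the choice of weights are all assumptions made for illustration.

```python
import numpy as np

def validation_stats(dyd, gebv, weights, dyd_reliability):
    """Weighted regression of daughter yield deviations (DYD) on predicted
    breeding values.

    Returns (validation_reliability, slope).
    `dyd_reliability` is assumed to be one scalar summarizing the reliability
    of the DYD; the abstract does not say how it was obtained.
    """
    y = np.asarray(dyd, dtype=float)
    x = np.asarray(gebv, dtype=float)
    w = np.asarray(weights, dtype=float)

    # Weighted means of predictor and response
    mx = np.average(x, weights=w)
    my = np.average(y, weights=w)

    # Weighted sums of squares and cross-products
    sxx = np.sum(w * (x - mx) ** 2)
    syy = np.sum(w * (y - my) ** 2)
    sxy = np.sum(w * (x - mx) * (y - my))

    slope = sxy / sxx               # inflation is judged from this slope
    r2 = sxy ** 2 / (sxx * syy)     # weighted coefficient of determination
    return r2 / dyd_reliability, slope
```

Dividing the coefficient of determination by the reliability of the daughter yield deviations corrects for the fact that those deviations are themselves only noisy measures of the true breeding value.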
The observed gain in validation reliability with HD genotypes was similar to expectations based on the number of markers and the effective number of segregating chromosome segments. Sampling error in the marker-based relationship coefficients, which causes overestimation of the model-based reliability, was smaller with HD genotypes. Accordingly, inflation of the genomic predictions was reduced with HD genotypes. Similar effects on model-based reliability and inflation, but not on validation reliability, were obtained by shrinkage estimation of the realized relationship matrix from 50K genotypes.
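The idea of shrinking a realized relationship matrix can be sketched as follows. The abstract does not describe the shrinkage estimator used in the study, so this example assumes a generic linear shrinkage toward a user-supplied target matrix with a fixed intensity `lam`. The function names, the target, the 0/1/2 genotype coding, and the use of sample allele frequencies are all hypothetical choices for illustration.

```python
import numpy as np

def vanraden_g(genotypes):
    """Realized (genomic) relationship matrix from 0/1/2 genotype codes.

    Rows are animals, columns are markers. Allele frequencies are taken
    from the sample itself (an assumption). At least one marker must be
    polymorphic, otherwise the scaling term is zero.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0                 # allele frequency per marker
    Z = M - 2.0 * p                          # centre each marker column
    scale = 2.0 * np.sum(p * (1.0 - p))      # expected variance summed over markers
    return Z @ Z.T / scale

def shrink_relationship(G, target, lam):
    """Linear shrinkage of G toward `target` with intensity lam in [0, 1].

    A generic form chosen for illustration, not necessarily the estimator
    used in the study.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    G = np.asarray(G, dtype=float)
    T = np.asarray(target, dtype=float)
    return (1.0 - lam) * G + lam * T
```

With `lam = 0` the marker-based matrix is returned unchanged, and with `lam = 1` it is replaced entirely by the target. Intermediate values reduce the influence of sampling error in the marker-based coefficients at the cost of some bias toward the target.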