Department of Dairy Science, University of Wisconsin, Madison, Wisconsin 53706, USA.
J Dairy Sci. 2010 Nov;93(11):5423-35. doi: 10.3168/jds.2010-3149.
The objective of the present study was to evaluate the predictive ability of direct genomic values for economically important dairy traits when genotypes at some single nucleotide polymorphism (SNP) loci were imputed rather than measured directly. Genotypic data consisted of 42,552 SNP genotypes for each of 1,762 Jersey sires. Phenotypic data consisted of predicted transmitting abilities (PTA) for milk yield, protein percentage, and daughter pregnancy rate from May 2006 for 1,446 sires in the training set and from April 2009 for 316 sires in the testing set. The SNP effects were estimated using the Bayesian least absolute shrinkage and selection operator (LASSO) method with data of sires in the training set, and direct genomic values (DGV) for sires in the testing set were computed by multiplying these estimates by corresponding genotype dosages for sires in the testing set. The mean correlation across traits between DGV (before progeny testing) and PTA (after progeny testing) for sires in the testing set was 70.6% when all 42,552 SNP genotypes were used. When genotypes for 93.1, 96.6, 98.3, or 99.1% of loci were masked and subsequently imputed in the testing set, mean correlations across traits between DGV and PTA were 68.5, 64.8, 54.8, or 43.5%, respectively. When genotypes were also masked and imputed for a random 50% of sires in the training set, mean correlations across traits between DGV and PTA were 65.7, 63.2, 53.9, or 49.5%, respectively. Results of this study indicate that if a suitable reference population with high-density genotypes is available, a low-density chip comprising 3,000 equally spaced SNP may provide approximately 95% of the predictive ability observed with the BovineSNP50 Beadchip (Illumina Inc., San Diego, CA) in Jersey cattle. However, if fewer than 1,500 SNP are genotyped, the accuracy of DGV may be limited by errors in the imputed genotypes of selection candidates.
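The DGV computation and its evaluation described in the abstract can be illustrated with a minimal Python sketch. It assumes SNP effects have already been estimated (the Bayesian LASSO step and the imputation step are not shown), and every function name and number below is hypothetical toy data, not taken from the study. The spacing helper selects loci evenly by index, which is only a stand-in for spacing by genomic map position.

```python
from math import sqrt


def direct_genomic_values(dosages, snp_effects):
    """Return one DGV per animal: the sum over loci of dosage * SNP effect."""
    return [sum(d * b for d, b in zip(row, snp_effects)) for row in dosages]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def equally_spaced_indices(n_loci, n_keep):
    """Pick n_keep locus indices spread evenly over n_loci (by index only)."""
    step = n_loci / n_keep
    return [int(i * step) for i in range(n_keep)]


# Hypothetical toy example: 3 testing-set sires, 4 SNP loci.
snp_effects = [0.5, -0.2, 0.1, 0.3]   # assumed estimates from a training set
dosages = [                            # genotype dosages coded 0, 1, or 2
    [0, 1, 2, 1],
    [2, 0, 1, 1],
    [1, 1, 0, 2],
]
pta = [0.4, 1.3, 0.8]                  # made-up PTA after progeny testing

dgv = direct_genomic_values(dosages, snp_effects)
accuracy = pearson(dgv, pta)           # predictive ability as a correlation

# Indices a low-density panel of 3,000 SNP might retain out of 42,552 loci.
low_density = equally_spaced_indices(42552, 3000)
```

With imputed genotypes, the dosages passed to `direct_genomic_values` may be fractional expected allele counts rather than integers; the same sum applies unchanged.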