Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, PO Box 65, 8200 AB Lelystad, The Netherlands.
J Dairy Sci. 2012 Feb;95(2):876-89. doi: 10.3168/jds.2011-4490.
Genomic selection using 50,000 single nucleotide polymorphism (50k SNP) chips has been implemented in many dairy cattle breeding programs. Cheap, low-density chips make genotyping of a larger number of animals cost effective. A commonly proposed strategy is to impute low-density genotypes up to the 50k SNP set before predicting direct genomic values (DGV). The objectives of this study were to investigate the accuracy of imputation for animals genotyped with a low-density chip and to investigate the effect of imputation on the reliability of DGV. Low-density chips contained 384, 3,000, or 6,000 SNP. Within each bin, the SNP selected was either the one with the highest minor allele frequency or the middle SNP, and DAGPHASE, CHROMIBD, and multivariate BLUP were used for imputation. Genotypes of 9,378 animals were used, of which approximately 2,350 animals had deregressed proofs. Bayesian stochastic search variable selection was used for estimating SNP effects of the 50k chip. Imputation accuracies and imputation error rates were poor for low-density chips with 384 SNP. Imputation accuracies were higher with 3,000 and 6,000 SNP. Performance of DAGPHASE and CHROMIBD was very similar and much better than that of multivariate BLUP for both imputation accuracy and reliability of DGV. With 3,000 SNP and using CHROMIBD or DAGPHASE for imputation, 84 to 90% of the increase in DGV reliability using the 50k chip, compared with a pedigree index, was obtained. With multivariate BLUP, the increase in reliability was only 40%. With 384 SNP, the reliability of DGV was lower than for a pedigree index, whereas with 6,000 SNP, about 93% of the increase in reliability of DGV based on the 50k chip was obtained when using DAGPHASE for imputation. Using genotype probabilities to predict gene content increased imputation accuracy and the reliability of DGV and is therefore recommended for applications of imputation for genomic prediction.
A deterministic equation was derived to predict accuracy of DGV based on imputation accuracy, which fitted closely with the observed relationship. The deterministic equation can be used to evaluate the effect of differences in imputation accuracy on accuracy and reliability of DGV.
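Two quantities in the abstract can be illustrated with a short sketch: gene content predicted from genotype probabilities, and the share of the 50k reliability gain over a pedigree index that is retained after imputation. This is a minimal illustration and not the authors' implementation. The function names, the array layout (probabilities of 0, 1, and 2 copies of the counted allele), and all numbers in the usage lines are assumptions made for this example and are not taken from the study.

```python
import numpy as np


def expected_gene_content(genotype_probs):
    """Expected allele count per SNP from imputed genotype probabilities.

    genotype_probs has shape (..., 3): P(0 copies), P(1 copy), P(2 copies).
    This layout is an assumption for the sketch, not the paper's format.
    """
    probs = np.asarray(genotype_probs, dtype=float)
    if probs.shape[-1] != 3:
        raise ValueError("last axis must hold three genotype probabilities")
    if not np.allclose(probs.sum(axis=-1), 1.0):
        raise ValueError("genotype probabilities must sum to 1")
    # E[count] = 0 * P(0) + 1 * P(1) + 2 * P(2)
    return probs @ np.array([0.0, 1.0, 2.0])


def most_likely_genotype(genotype_probs):
    """Alternative: take the single most probable genotype (0, 1, or 2)."""
    return np.argmax(np.asarray(genotype_probs, dtype=float), axis=-1)


def fraction_of_gain_retained(rel_imputed, rel_50k, rel_pedigree):
    """Share of the 50k reliability gain over a pedigree index that is kept."""
    gain_50k = rel_50k - rel_pedigree
    if gain_50k <= 0:
        raise ValueError("50k reliability must exceed pedigree-index reliability")
    return (rel_imputed - rel_pedigree) / gain_50k


# Hypothetical probabilities for two SNP of one animal (illustrative only).
probs = [[0.1, 0.6, 0.3], [0.0, 0.05, 0.95]]
print(expected_gene_content(probs))   # fractional gene content per SNP
print(most_likely_genotype(probs))    # discrete genotype calls per SNP

# Hypothetical reliabilities (illustrative only, not values from the study).
print(fraction_of_gain_retained(rel_imputed=0.48, rel_50k=0.50, rel_pedigree=0.30))
```

The expected gene content keeps the uncertainty of an imputed genotype as a fractional value, whereas the most likely genotype discards it. That difference is one plausible reading of why the abstract recommends genotype probabilities.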