INRA, UMR1313 Génétique Animale et Biologie Intégrative (GABI), 78350 Jouy-en-Josas, France.
J Dairy Sci. 2011 Jul;94(7):3679-86. doi: 10.3168/jds.2011-4299.
The purpose of this study was to investigate the imputation error and loss of reliability of direct genomic values (DGV) or genomically enhanced breeding values (GEBV) when using genotypes imputed from a 3,000-marker single nucleotide polymorphism (SNP) panel to a 50,000-marker SNP panel. Data consisted of genotypes of 15,966 European Holstein bulls from the combined EuroGenomics reference population. Low-density genotypes were created by erasing markers from the 50,000-marker data. The studies were performed in the Nordic countries (Denmark, Finland, and Sweden) using a BLUP model for prediction of DGV and in France using a genomic marker-assisted selection approach for prediction of GEBV. Imputation in both studies was done using a combination of the DAGPHASE 1.1 and Beagle 2.1.3 software. Traits considered were protein yield, fertility, somatic cell count, and udder depth. Imputation of missing markers and prediction of breeding values were performed using 2 different reference populations in each country: either a national reference population or the combined EuroGenomics reference population. Accuracy of imputation and of genomic prediction was validated on national test data. Mean imputation error rates when using national reference animals were 5.5 and 3.9% in the Nordic countries and France, respectively, whereas imputation based on the EuroGenomics reference data set gave mean error rates of 4.0 and 2.1%, respectively. Prediction of GEBV based on genotypes imputed with a national reference data set gave an absolute loss of 0.05 in mean reliability of GEBV in the French study, whereas a loss of 0.03 was obtained for reliability of DGV in the Nordic study. When genotypes were imputed using the EuroGenomics reference, a loss of 0.02 in mean reliability of GEBV was detected in the French study, and a loss of 0.06 was observed for the mean reliability of DGV in the Nordic study.
Consequently, the reliability of DGV using the imputed SNP data was 0.38 based on national reference data, and 0.48 based on EuroGenomics reference data in the Nordic validation, and the reliability of GEBV using the imputed SNP data was 0.41 based on national reference data, and 0.44 based on EuroGenomics reference data in the French validation.
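The reliabilities and losses reported above can be combined to back out the reliability that would be expected with actual 50,000-marker genotypes. The sketch below is a reader's illustration only: it assumes that each reported loss applies directly to the corresponding reported mean reliability (implied = imputed + loss), and the dictionary layout and function name are hypothetical. Because the published figures are rounded to 2 decimals, the implied values are approximate and are not stated in the abstract itself.

```python
# Values taken from the abstract: mean reliability with imputed genotypes
# and the absolute loss relative to actual 50,000-marker genotypes.
reported = {
    ("Nordic", "national"): {"imputed": 0.38, "loss": 0.03},
    ("Nordic", "EuroGenomics"): {"imputed": 0.48, "loss": 0.06},
    ("France", "national"): {"imputed": 0.41, "loss": 0.05},
    ("France", "EuroGenomics"): {"imputed": 0.44, "loss": 0.02},
}


def implied_full_density(imputed: float, loss: float) -> float:
    """Assumed relation: reliability with actual 50K genotypes = imputed + loss."""
    return round(imputed + loss, 2)


# Implied reliabilities (an inference from rounded figures, not a reported result).
implied = {key: implied_full_density(**vals) for key, vals in reported.items()}

for (study, reference), value in implied.items():
    print(f"{study:7s} {reference:13s} implied 50K reliability ~ {value:.2f}")
```

Under this assumption, both the absolute reliability and the cost of imputation can be compared across reference populations for each study.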