Boison S A, Santos D J A, Utsunomiya A H T, Carvalheiro R, Neves H H R, O'Brien A M Perez, Garcia J F, Sölkner J, da Silva M V G B
University of Natural Resources and Life Sciences, Department of Sustainable Agricultural Systems, Gregor-Mendel 33, A-1180, Vienna, Austria.
Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista (UNESP), SP, 148841900, Brazil.
J Dairy Sci. 2015 Jul;98(7):4969-89. doi: 10.3168/jds.2014-9213. Epub 2015 May 7.
Genotype imputation is widely used as a cost-effective strategy in genomic evaluation of cattle. Key determinants of imputation accuracy, such as linkage disequilibrium patterns, marker densities, and ascertainment bias, differ between Bos indicus and Bos taurus breeds. Consequently, there is a need to investigate the effectiveness of genotype imputation in indicine breeds. Thus, the objective of the study was to investigate strategies and factors affecting the accuracy of genotype imputation in Gyr (Bos indicus) dairy cattle. Four imputation scenarios were studied using 471 sires and 1,644 dams genotyped on Illumina BovineHD (HD-777K; San Diego, CA) and BovineSNP50 (50K) chips, respectively. Scenarios were based on which reference high-density single nucleotide polymorphism (SNP) panel (HDP) should be adopted [HD-777K, 50K, and GeneSeek GGP-75Ki (Lincoln, NE)]. Depending on the scenario, validation animals had their genotypes masked for one of the lower-density panels: Illumina (3K, 7K, and 50K) and GeneSeek (SGGP-20Ki and GGP-75Ki). We randomly selected 171 sires as reference and 300 as validation for all the scenarios. Additionally, all sires were used as reference and the 1,644 dams were imputed for validation. Genotypes of 98 individuals with 4 or more offspring were completely masked and imputed. Imputation algorithms FImpute and Beagle v3.3 and v4 were used. Imputation accuracies were measured using the correlation and allelic correct rate. FImpute resulted in the highest accuracies, whereas Beagle v3.3 gave the least-accurate imputations. Using 50K as HDP, accuracies evaluated as correlation (allelic correct rate) ranged from 0.910 (0.942) with the 3K low-density panel to 0.961 (0.974) with the 7K panel. With GGP-75Ki as HDP, accuracies were moderate for 3K, 7K, and 50K, but high for SGGP-20Ki. The use of HD-777K as HDP resulted in accuracies of 0.888 (3K), 0.941 (7K), 0.980 (SGGP-20Ki), 0.982 (50K), and 0.993 (GGP-75Ki).
Ungenotyped individuals were imputed with an average accuracy of 0.970. The average of the top 5 kinship coefficients between reference and imputed individuals was a strong predictor of imputation accuracy. FImpute was faster and used less memory than Beagle v4. Beagle v4 outperformed Beagle v3.3 in accuracy and speed of computation. A genotyping strategy that uses the HD-777K SNP chip as a reference panel and SGGP-20Ki as the lower-density SNP panel should be adopted, as accuracy was high and similar to that of the 50K. However, the effect of using imputed HD-777K genotypes from the SGGP-20Ki on genomic evaluation is yet to be studied.
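The abstract does not define how the two accuracy measures and the kinship predictor were computed. The sketch below illustrates one common way to do it, assuming genotypes are coded as allele counts (0, 1, 2), that the allelic correct rate is the share of alleles imputed correctly, and that the correlation is a Pearson correlation between true and imputed genotypes. The function names and the toy data are hypothetical and are not taken from the study.

```python
from math import sqrt


def allelic_correct_rate(true_gt, imputed_gt):
    """Share of alleles imputed correctly (assumed definition).

    Genotypes are allele counts (0, 1, 2), so each genotype carries two
    alleles and |true - imputed| counts the wrongly imputed alleles.
    """
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(abs(t - i) for t, i in zip(true_gt, imputed_gt))
    return 1.0 - wrong / (2.0 * len(true_gt))


def genotype_correlation(true_gt, imputed_gt):
    """Pearson correlation between true and imputed genotypes.

    Raises ZeroDivisionError if either input has zero variance.
    """
    if len(true_gt) != len(imputed_gt) or not true_gt:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_gt)
    mean_t = sum(true_gt) / n
    mean_i = sum(imputed_gt) / n
    cov = sum((t - mean_t) * (i - mean_i) for t, i in zip(true_gt, imputed_gt))
    var_t = sum((t - mean_t) ** 2 for t in true_gt)
    var_i = sum((i - mean_i) ** 2 for i in imputed_gt)
    return cov / sqrt(var_t * var_i)


def mean_top_k_kinship(kinships, k=5):
    """Mean of the k largest kinship coefficients between one imputed
    animal and the reference animals (k=5 mirrors the 'top 5' above)."""
    if not kinships:
        raise ValueError("kinships must be non-empty")
    top = sorted(kinships, reverse=True)[:k]
    return sum(top) / len(top)


# Toy data (hypothetical): masked genotypes of one validation animal.
true_gt = [0, 1, 2, 1, 2, 2, 0, 1]
imputed_gt = [0, 1, 1, 1, 2, 2, 0, 2]

print(allelic_correct_rate(true_gt, imputed_gt))  # 2 of 16 alleles wrong -> 0.875
print(genotype_correlation(true_gt, imputed_gt))
print(mean_top_k_kinship([0.5, 0.25, 0.1, 0.05, 0.3, 0.0, 0.125]))
```

Under these assumptions the allelic correct rate is always at least as lenient as a genotype-level concordance, because a heterozygote imputed as a homozygote still counts as one correct allele. That is consistent with the abstract reporting higher values for the allelic correct rate than for the correlation.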