Faculty of Veterinary Medicine, University of Liège, Liège, Belgium.
J Dairy Sci. 2010 Nov;93(11):5443-54. doi: 10.3168/jds.2010-3255.
Imputation of missing genotypes is important for combining data from animals genotyped on different single nucleotide polymorphism (SNP) panels. Because of the evolution of available technologies, economic reasons, or the coexistence of several products from competing organizations, animals might be genotyped on different SNP chips. Combined analysis of all the data increases the accuracy of genomic selection and the precision of fine mapping. In the present study, real data from 4,738 Dutch Holstein animals genotyped with custom-made 60K Illumina panels (Illumina, San Diego, CA) were used to mimic imputation of genotypes between 2 SNP panels of approximately 27,500 markers each, with 9,265 SNP markers in common. Imputation efficiency increased with the number of reference animals (genotyped on both chips), when animals genotyped on a single chip were included in the training data, with higher regional marker density, with greater distance from chromosome ends, and with a closer relationship between imputed and reference animals. As the number of animals genotyped on both chips increased from 0 to 2,000, the mean imputation error rate fell from 2.774% to 0.415% and accuracy rose from 0.81 to 0.96. Imputation was then applied in the Dutch Holstein population to predict alleles at markers of the Illumina Bovine SNP50 chip from markers on a custom-made 60K Illumina panel. A cross-validation study performed on 102 bulls indicated that the mean error rate per bull was approximately 1.0%. This study showed that it is feasible to impute markers in dairy cattle with the current marker panels and with error rates below 1%.
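The abstract reports an imputation error rate and an accuracy but does not define either. As an illustration only, the sketch below assumes that the error rate is the fraction of alleles imputed incorrectly and that accuracy is the Pearson correlation between true and imputed genotypes, with genotypes coded as 0, 1, or 2 copies of one allele. The function names and the toy data are hypothetical and are not taken from the study.

```python
from math import sqrt


def allele_error_rate(true_genotypes, imputed_genotypes):
    """Fraction of alleles imputed incorrectly (assumed definition).

    Genotypes are coded 0/1/2, so each genotype carries 2 alleles and
    |true - imputed| counts the alleles that differ.
    """
    if not true_genotypes or len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must be non-empty and of equal length")
    wrong = sum(abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return wrong / (2 * len(true_genotypes))


def genotype_correlation(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed genotypes (assumed definition)."""
    if not true_genotypes or len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must be non-empty and of equal length")
    n = len(true_genotypes)
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if var_t == 0 or var_i == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / sqrt(var_t * var_i)


# Toy data: 8 masked genotypes, one of which is imputed with a single wrong allele.
true_toy = [0, 1, 2, 2, 1, 0, 1, 2]
imputed_toy = [0, 1, 2, 1, 1, 0, 1, 2]

print(f"error rate: {allele_error_rate(true_toy, imputed_toy):.4f}")
print(f"accuracy:   {genotype_correlation(true_toy, imputed_toy):.4f}")
```

In a cross-validation setting such as the one described for the 102 bulls, these functions would be applied to genotypes that were masked before imputation and then compared with their known values, for example once per animal before averaging.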