Chassier Marjorie, Barrey Eric, Robert Céline, Duluard Arnaud, Danvy Sophie, Ricard Anne
Unité Mixte de Recherche 1313 Génétique Animale et Biologie Intégrative, Département Sciences du Vivant, Institut National de la Recherche Agronomique, AgroParisTech, Université Paris Saclay, Jouy-en-Josas, France.
Ecole Nationale Vétérinaire d'Alfort, Maisons Alfort, France.
J Anim Breed Genet. 2018 Oct;135(6):420-431. doi: 10.1111/jbg.12358. Epub 2018 Oct 9.
Genotype imputation is now a key component of genomic analyses as it increases the density of available genotypes within a population. However, many factors can influence imputation accuracy. The aim of this study was to assess and compare the accuracy of imputation of high-density genotypes (Affymetrix Axiom Equine genotyping array, 670,806 SNPs) from two moderate-density genotypes (Illumina Equine SNP50 BeadChip, 54,602 SNPs and Illumina Equine SNP70 BeadChip, 65,157 SNPs), using single-breed or multiple-breed reference sets. Genotypes were available from five groups of horse breeds: Arab (AR, 1,207 horses), Trotteur Français (TF, 979 horses), Selle Français (SF, 1,979 horses), Anglo-Arab (AA, 229 horses) and various foreign sport horses (FH, 209 horses). The proportions of horses genotyped with the high-density (HD) chip in each breed group were 10% in AA, 15% in AR and FH, 30% in TF and 57% in SF. A validation set consisting of one-third of the horses genotyped with the HD chip was formed and their genotypes deleted. Two imputation strategies were compared, one in which the reference population consisted only of horses from the same breed group as in the validation set, and another with horses from all breed groups. For the first strategy, concordance rates (CRs) ranged from 97.8% (AR) to 99.0% (TF) and correlations (r²) from 0.94 (AR) to 0.99 (TF). For the second strategy, CR ranged from 97.4% (AR) to 98.9% (TF) and r² from 0.93 (AR) to 0.99 (TF). Overall, the results show a small advantage of within-breed imputation compared with multi-breed imputation. Adding horses from different breed groups to the reference population does not improve the accuracy of imputation. Imputation provides an accurate means of combining data sets from different genotyping platforms, now necessary with the increasing use of the recently developed Affymetrix Axiom Equine genotyping array.
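The two accuracy measures reported above can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts (0, 1 or 2), that the concordance rate is the share of imputed calls identical to the masked true calls, and that r² is the squared Pearson correlation between true and imputed allele counts pooled over all calls. The abstract does not state whether the authors computed these per SNP, per horse or pooled, so the pooling and the function names `concordance_rate` and `squared_correlation` are illustrative assumptions, not the authors' implementation.

```python
from math import nan


def concordance_rate(true_calls, imputed_calls):
    """Share of imputed genotype calls that exactly match the true calls."""
    if len(true_calls) != len(imputed_calls) or not true_calls:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_calls, imputed_calls))
    return matches / len(true_calls)


def squared_correlation(true_calls, imputed_calls):
    """Squared Pearson correlation between true and imputed allele counts."""
    if len(true_calls) != len(imputed_calls) or not true_calls:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(true_calls)
    mean_t = sum(true_calls) / n
    mean_i = sum(imputed_calls) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_calls, imputed_calls))
    var_t = sum((t - mean_t) ** 2 for t in true_calls)
    var_i = sum((i - mean_i) ** 2 for i in imputed_calls)
    if var_t == 0 or var_i == 0:
        # Undefined when either vector is constant (e.g. a monomorphic SNP).
        return nan
    return cov * cov / (var_t * var_i)


# Toy data (made up for illustration): 8 masked calls, one imputed wrongly.
true_calls = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_calls = [0, 1, 2, 1, 0, 2, 2, 1]

print(concordance_rate(true_calls, imputed_calls))   # 7 of 8 calls match
print(squared_correlation(true_calls, imputed_calls))
```

Concordance rate treats every mismatch equally, whereas r² also depends on allele frequency, which is one reason the two measures are usually reported together.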