School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA, USA.
Mol Ecol Resour. 2012 Nov;12(6):1114-23. doi: 10.1111/1755-0998.12002. Epub 2012 Sep 8.
Genotyping errors are present in almost all genetic data and can affect the biological conclusions of a study, particularly for studies based on individual identification and parentage. Many statistical approaches can incorporate genotyping errors, but they usually need accurate estimates of error rates. Here, we used a new microsatellite data set developed for brown rockfish (Sebastes auriculatus) to estimate genotyping error using three approaches: (i) repeat genotyping of 5% of samples, (ii) comparing unintentionally recaptured individuals and (iii) Mendelian inheritance error checking for known parent-offspring pairs. In each data set, we quantified the genotyping error rate per allele due to allele drop-out and false alleles. Averaged over loci, the overall genotyping error rate per locus by direct count was 0.3%, 1.5% and 1.7% (0.002, 0.007 and 0.008 per allele) from replicate genotypes, known parent-offspring pairs and unintentionally recaptured individuals, respectively. By direct-count error estimates, the recapture and known parent-offspring data sets revealed an error rate four times greater than that estimated using repeat genotypes. There was no evidence of correlation between error rates and locus variability in any of the three data sets, and errors appeared to occur randomly over loci in the repeat genotypes, but not in the recapture and parent-offspring comparisons. Furthermore, there was no correlation in locus-specific error rates between any two of the three data sets. Our data suggest that repeat genotyping may underestimate true error rates and may not estimate locus-specific error rates accurately. We therefore suggest using methods for error estimation that correspond to the overall aim of the study (e.g. known parent-offspring comparisons in parentage studies).
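The direct-count estimates described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes diploid genotypes stored as pairs of allele sizes, counts a per-locus error whenever two genotypes of the same individual differ at all, and counts per-allele errors as the number of mismatched alleles out of two per comparison. The function names (allele_mismatches, direct_count_error, mendelian_compatible) and the example allele sizes are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import List, Tuple

# Assumption: a genotype is the pair of allele sizes at one diploid locus.
Genotype = Tuple[int, int]


def allele_mismatches(a: Genotype, b: Genotype) -> int:
    """Count alleles (0, 1 or 2) that differ between two genotypes, ignoring allele order."""
    shared = sum((Counter(a) & Counter(b)).values())
    return 2 - shared


def direct_count_error(pairs: List[Tuple[Genotype, Genotype]]) -> Tuple[float, float]:
    """Return (per-locus rate, per-allele rate) from pairs of genotypes
    scored twice for the same individual at the same locus."""
    if not pairs:
        raise ValueError("no genotype comparisons supplied")
    mismatches = [allele_mismatches(a, b) for a, b in pairs]
    # Per-locus: fraction of comparisons with any mismatch.
    per_locus = sum(1 for m in mismatches if m > 0) / len(pairs)
    # Per-allele: mismatched alleles over all alleles compared (two per comparison).
    per_allele = sum(mismatches) / (2 * len(pairs))
    return per_locus, per_allele


def mendelian_compatible(parent: Genotype, offspring: Genotype) -> bool:
    """A known parent and its offspring must share at least one allele at a locus."""
    return bool(set(parent) & set(offspring))


# Made-up example: four repeat comparisons, one showing a possible allele drop-out.
example_pairs = [
    ((100, 104), (100, 104)),
    ((100, 104), (100, 100)),
    ((98, 98), (98, 98)),
    ((102, 106), (106, 102)),
]
per_locus_rate, per_allele_rate = direct_count_error(example_pairs)
```

The same counting function can be applied to recaptured individuals, since a recapture is in effect an unplanned repeat genotype. For known parent-offspring pairs, a locus where mendelian_compatible returns False flags a genotyping error (or a mutation) in one of the two genotypes.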