Department of Fish and Wildlife Sciences, University of Idaho, Moscow, Idaho.
Idaho Department of Fish and Game, Lewiston, Idaho.
Mol Ecol Resour. 2018 Nov;18(6):1263-1281. doi: 10.1111/1755-0998.12910. Epub 2018 Jul 9.
The development of high-throughput sequencing technologies is dramatically increasing the use of single nucleotide polymorphisms (SNPs) across the field of genetics, but most parentage studies of wild populations still rely on microsatellites. We developed a bioinformatic pipeline for identifying SNP panels that are informative for parentage analysis from restriction site-associated DNA sequencing (RADseq) data. This pipeline includes options for analysis with or without a reference genome, and provides methods to maximize genotyping accuracy and select sets of unlinked loci that have high statistical power. We test this pipeline on small populations of Mexican gray wolf and bighorn sheep, for which parentage analyses are expected to be challenging due to low genetic diversity and the presence of many closely related individuals. We compare the results of parentage analysis across SNP panels generated with or without the use of a reference genome, and between SNPs and microsatellites. For Mexican gray wolf, we conducted parentage analyses for 30 pups from a single cohort where samples were available from 64% of possible mothers and 53% of possible fathers, and the accuracy of parentage assignments could be estimated because true identities of parents were known a priori based on field data. For bighorn sheep, we conducted maternity analyses for 39 lambs from five cohorts where 77% of possible mothers were sampled, but true identities of parents were unknown. Analyses with and without a reference genome produced SNP panels with ≥95% parentage assignment accuracy for Mexican gray wolf, outperforming microsatellites at 78% accuracy. Maternity assignments were completely consistent across all SNP panels for the bighorn sheep, and were 74.4% consistent with assignments from microsatellites. 
Accuracy and consistency of parentage analysis were not reduced when using as few as 284 SNPs for Mexican gray wolf and 142 SNPs for bighorn sheep, indicating our pipeline can be used to develop SNP genotyping assays for parentage analysis with relatively small numbers of loci.
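The abstract does not specify how unlinked, high-power loci are chosen, so the following is only a minimal sketch of one common approach when a reference genome is available: rank candidate SNPs by minor allele frequency (a rough proxy for per-locus power in parentage analysis) and greedily keep those separated by a minimum physical distance on the same chromosome (a rough proxy for being unlinked). The names `Snp`, `select_panel`, `n_loci`, and `min_spacing_bp`, and all example values, are hypothetical and are not taken from the authors' pipeline.

```python
from typing import List, NamedTuple


class Snp(NamedTuple):
    chrom: str   # chromosome or scaffold name (assumes a reference genome)
    pos: int     # position in base pairs
    maf: float   # minor allele frequency, between 0 and 0.5


def select_panel(snps: List[Snp], n_loci: int, min_spacing_bp: int) -> List[Snp]:
    """Greedily pick up to n_loci SNPs, highest MAF first, skipping any SNP
    closer than min_spacing_bp to an already chosen SNP on the same chromosome.

    Illustrative assumption only: physical spacing stands in for linkage,
    and MAF stands in for statistical power.
    """
    chosen: List[Snp] = []
    for snp in sorted(snps, key=lambda s: s.maf, reverse=True):
        if len(chosen) == n_loci:
            break
        too_close = any(
            c.chrom == snp.chrom and abs(c.pos - snp.pos) < min_spacing_bp
            for c in chosen
        )
        if not too_close:
            chosen.append(snp)
    return chosen


# Made-up candidate SNPs for illustration.
candidates = [
    Snp("chr1", 1_000, 0.48),
    Snp("chr1", 1_500, 0.45),    # within 100 kb of chr1:1000, so skipped
    Snp("chr1", 900_000, 0.30),
    Snp("chr2", 5_000, 0.40),
    Snp("chr2", 6_000, 0.10),
]
panel = select_panel(candidates, n_loci=3, min_spacing_bp=100_000)
for snp in panel:
    print(snp.chrom, snp.pos, snp.maf)
```

Without a reference genome, physical positions are unavailable, so a filter of this kind would need a different criterion, for example a pairwise linkage disequilibrium threshold estimated from the genotype data.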