Literman Robert A, Ott Brittany M, Wen Jun, Grauke L J, Schwartz Rachel S, Handy Sara M
Office of Regulatory Science, Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, Maryland, USA.
Office of Food Additive Safety, Center for Food Safety and Applied Nutrition, U.S. Food and Drug Administration, College Park, Maryland, USA.
Appl Plant Sci. 2022 Jan 20;10(1):e11455. doi: 10.1002/aps3.11455. eCollection 2022 Jan-Feb.
DNA-based species identification is critical when morphological identification is restricted, but DNA-based identification pipelines typically rely on the ability to compare homologous sequence data across species. Because many clades lack robust genomic resources, we present here a bioinformatics pipeline capable of generating genome-wide single-nucleotide polymorphism (SNP) data while circumventing the need for any reference genome or annotation data.
Using the SISRS bioinformatics pipeline, we generated de novo ortholog data for the genus , isolating sites where genetic variation was restricted to a single species (i.e., species-informative SNPs). We leveraged these SNPs to identify both full-species and hybrid specimens, even at very low sequencing depths.
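As a rough illustration of the species-informative SNP idea described above, the following Python sketch scans a toy alignment for sites where exactly one species carries a different base from all others, then scores a specimen by how often it shows each species' diagnostic allele. This is not the SISRS implementation: the function names, the input layout (one fixed base per species per site, and a set of observed bases per site for the specimen), and the toy data are all assumptions made for illustration.

```python
from collections import Counter

BASES = set("ACGT")


def species_informative_sites(alleles_by_species):
    """Return {site_index: (species, allele)} for sites where exactly one
    species differs from every other species.

    alleles_by_species: hypothetical input, dict of species name -> string
    (or list) holding one fixed base per site; anything outside A/C/G/T
    is treated as missing data.
    """
    species = list(alleles_by_species)
    n_sites = len(alleles_by_species[species[0]])
    informative = {}
    for i in range(n_sites):
        column = {sp: alleles_by_species[sp][i] for sp in species}
        if any(base not in BASES for base in column.values()):
            continue  # skip sites with missing data in any species
        counts = Counter(column.values())
        if len(counts) != 2:
            continue  # invariant site, or more than two alleles
        (_, n_major), (minor, n_minor) = counts.most_common(2)
        # Variation must be confined to a single species.
        if n_minor == 1 and n_major >= 2:
            carrier = next(sp for sp, b in column.items() if b == minor)
            informative[i] = (carrier, minor)
    return informative


def species_signal(informative, observed):
    """For each species, return the fraction of its covered informative
    sites at which the specimen shows that species' diagnostic allele.

    observed: hypothetical input, dict of site_index -> bases seen in the
    specimen's reads at that site (e.g. "TC" for a heterozygous site).
    """
    hits, covered = Counter(), Counter()
    for site, (sp, allele) in informative.items():
        bases = observed.get(site)
        if not bases:
            continue  # no reads at this site (common at low depth)
        covered[sp] += 1
        if allele in bases:
            hits[sp] += 1
    return {sp: hits[sp] / covered[sp] for sp in covered}


toy_alignment = {
    "sp1": "ACGTA",
    "sp2": "ACGTN",
    "sp3": "ATGCA",
    "sp4": "ACGCA",
}
toy_sites = species_informative_sites(toy_alignment)

toy_markers = {0: ("sp1", "T"), 1: ("sp1", "G"), 2: ("sp2", "A"),
               3: ("sp2", "C"), 4: ("sp3", "T")}
toy_specimen = {0: "TC", 1: "G", 2: "A", 3: "G", 4: "C"}
toy_signal = species_signal(toy_markers, toy_specimen)
```

In this sketch a full-species specimen would be expected to show a high fraction for one species only, whereas a hybrid would show elevated fractions for two species; any threshold for calling either case is left open here because the abstract does not state one.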
We identified between 46,000 and 476,000 species-identifying SNPs for each of eight diploid species, and all species identifications were concordant with the species of record. For every putative F1 hybrid specimen, both parental species were correctly identified, and more punctate patterns of introgression were detectable in more cryptic crosses.
Bioinformatics pipelines that use only short-read sequencing data provide vital new tools enabling rapid expansion of DNA identification assays for model and non-model clades alike.