Sánchez Barreiro Fátima, Vieira Filipe G, Martin Michael D, Haile James, Gilbert M Thomas P, Wales Nathan
Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350, Copenhagen K, Denmark.
NTNU University Museum, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway.
Mol Ecol Resour. 2017 Mar;17(2):209-220. doi: 10.1111/1755-0998.12610. Epub 2016 Nov 7.
Population genetic studies of nonmodel organisms frequently employ reduced representation library (RRL) methodologies, many of which rely on protocols in which genomic DNA is digested by one or more restriction enzymes. However, because high molecular weight DNA is recommended for these protocols, samples with degraded DNA are generally unsuitable for RRL methods. Given that ancient and historic specimens can provide key temporal perspectives on evolutionary questions, we explored how custom-designed RNA probes could enrich for RRL loci (Restriction Enzyme-Associated Loci baits, or REALbaits). Starting with genotyping-by-sequencing (GBS) data generated from modern common ragweed (Ambrosia artemisiifolia L.) specimens, we designed 20 000 RNA probes to target well-characterized genomic loci in herbarium voucher specimens dating from 1835 to 1913. Compared to shotgun sequencing, we observed 19- to 151-fold enrichment of the targeted loci. Using our GBS capture pipeline on a data set of 38 herbarium samples, we discovered 22 813 SNPs, providing sufficient genomic resolution to distinguish geographic populations. For these samples, we found that diluting REALbaits to 10% of their original concentration still yielded sufficient data for downstream analyses and that a sequencing depth of ~7 million reads was sufficient to characterize most loci without wasting sequencing capacity. In addition, we observed that targeted loci had highly variable rates of success, which we primarily attribute to sequence similarity between loci, a trait that interferes with unambiguous read mapping. Our findings can help researchers design capture experiments for RRL loci, thereby providing an efficient means to integrate samples with degraded DNA into existing RRL data sets.
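As a rough illustration of how a fold-enrichment figure such as the 19- to 151-fold range could be computed, the sketch below divides the on-target read fraction of a captured library by that of a shotgun library from the same sample. This is one common definition and is an assumption here; the abstract does not state the exact formula the authors used. The function name, its parameters, and the read counts in the usage lines are hypothetical and are not data from the study.

```python
def fold_enrichment(capture_on_target, capture_total,
                    shotgun_on_target, shotgun_total):
    """Ratio of on-target read fractions: captured library vs. shotgun library.

    Assumed definition (not taken from the paper):
        (capture_on_target / capture_total) / (shotgun_on_target / shotgun_total)
    """
    if capture_total <= 0 or shotgun_total <= 0:
        raise ValueError("total read counts must be positive")
    if shotgun_on_target <= 0:
        # With no on-target shotgun reads the ratio is undefined.
        raise ValueError("shotgun on-target count must be positive")

    capture_fraction = capture_on_target / capture_total
    shotgun_fraction = shotgun_on_target / shotgun_total
    return capture_fraction / shotgun_fraction


# Hypothetical counts for a single sample, for illustration only.
example = fold_enrichment(
    capture_on_target=300_000, capture_total=1_000_000,
    shotgun_on_target=6_000, shotgun_total=1_000_000,
)
print(f"fold enrichment: {example:.1f}")
```

Normalizing by total reads in each library keeps the comparison fair when the captured and shotgun libraries are sequenced to different depths.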