Department of Primatology, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.
Institut de Biologia Evolutiva (Consejo Superior de Investigaciones Científicas-Universitat Pompeu Fabra), Barcelona Biomedical Research Park, Barcelona, Spain.
Mol Ecol Resour. 2019 May;19(3):609-622. doi: 10.1111/1755-0998.12993.
Large-scale genomic studies of wild animal populations are often limited by access to high-quality DNA. Although noninvasive samples, such as faeces, can be readily collected, DNA from the animal that produced the sample is usually present in low quantities, fragmented, and contaminated by microbial and dietary DNA. Hybridization capture can help to overcome these impediments by increasing the proportion of subject DNA prior to high-throughput sequencing. Here we evaluate a key design variable for hybridization capture, the number of rounds of capture, by testing whether one or two rounds are more appropriate for samples of varying quality (measured as the ratio of subject to total DNA). We used a set of 1,780 quality-assessed wild chimpanzee (Pan troglodytes schweinfurthii) faecal samples and chose 110 samples of varying quality for exome capture and sequencing. We used multiple regression to assess the effects of sample quality (the ratio of subject to total DNA), rounds of capture and sequencing effort on the number of unique exome reads sequenced. We show that one round of capture is preferable when the proportion of subject DNA in a sample is above ~2%-3%. We also explore various types of bias introduced by capture, and develop a model that predicts the sequencing effort necessary for a desired data yield from samples of a given quality. Our results thus provide a practical guide for researchers planning similar hybridization capture studies.
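To illustrate the kind of analysis described, the following is a minimal sketch of a multiple regression of unique reads on sample quality, rounds of capture and sequencing effort, followed by inverting the fitted model to estimate the sequencing effort needed for a target yield. It is not the authors' model: the log-log form, the absence of a quality-by-rounds interaction, the function names, and all numbers (including the simulated data and the coefficients used to generate them) are assumptions made purely for illustration.

```python
import numpy as np

def design_matrix(quality, rounds, effort):
    # Columns: intercept, log10(quality), indicator for two rounds, log10(effort).
    quality = np.atleast_1d(np.asarray(quality, dtype=float))
    rounds = np.atleast_1d(np.asarray(rounds))
    effort = np.atleast_1d(np.asarray(effort, dtype=float))
    return np.column_stack([
        np.ones(len(quality)),
        np.log10(quality),
        (rounds == 2).astype(float),
        np.log10(effort),
    ])

def fit_yield_model(quality, rounds, effort, unique_reads):
    # Ordinary least squares on the log10 scale (an assumed model form).
    X = design_matrix(quality, rounds, effort)
    y = np.log10(np.asarray(unique_reads, dtype=float))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_unique_reads(coef, quality, rounds, effort):
    # Predicted number of unique reads for given sample quality, rounds, effort.
    return 10 ** (design_matrix(quality, rounds, effort) @ coef)

def required_effort(coef, quality, rounds, target_unique_reads):
    # Solve the fitted linear equation for log10(effort).
    b0, b_quality, b_rounds, b_effort = coef
    log_effort = (
        np.log10(target_unique_reads)
        - b0
        - b_quality * np.log10(quality)
        - b_rounds * float(rounds == 2)
    ) / b_effort
    return 10 ** log_effort

# Simulated data only: 110 hypothetical samples with made-up coefficients.
rng = np.random.default_rng(0)
n = 110
sim_quality = 10 ** rng.uniform(-3, -1, n)   # subject/total DNA ratio
sim_rounds = rng.integers(1, 3, n)           # 1 or 2 rounds of capture
sim_effort = 10 ** rng.uniform(5, 7, n)      # reads sequenced
true_coef = np.array([1.0, 0.5, 0.3, 0.8])   # hypothetical, not from the study
log_reads = design_matrix(sim_quality, sim_rounds, sim_effort) @ true_coef
sim_reads = 10 ** (log_reads + rng.normal(0, 0.05, n))

coef = fit_yield_model(sim_quality, sim_rounds, sim_effort, sim_reads)
```

With a fitted `coef`, `required_effort(coef, 0.02, 1, target)` returns the number of reads that the simulated model implies for a sample with 2% subject DNA and one round of capture. A model intended to locate a threshold such as the ~2%-3% reported above would also need a term letting the effect of a second round depend on sample quality, which this sketch omits.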