Department of Forest and Conservation Sciences, Faculty of Forestry, University of British Columbia, Vancouver, BC, Canada.
Mol Ecol Resour. 2018 May;18(3):525-540. doi: 10.1111/1755-0998.12758. Epub 2018 Feb 12.
Approximate Bayesian computation (ABC) is widely used to infer the demographic history of populations and species from DNA markers. Genomic markers can now be developed for nonmodel species using reduced representation library (RRL) sequencing methods, which select a fraction of the genome through targeted sequence capture or restriction enzymes (genotyping-by-sequencing, GBS). We explored how marker number and length, knowledge of gametic phase, and the tradeoff between sample size and sequencing depth affect the quality of demographic inferences performed with ABC. We focused on two-population models of recent spatial expansion with varying numbers of unknown parameters. Performing ABC on simulated data sets with known parameter values, we found that the timing of a recent spatial expansion event could be precisely estimated in a three-parameter model. Accounting for uncertainty in parameters such as initial population size and migration rate collectively decreased the precision of inferences dramatically. Phasing haplotypes did not improve results, regardless of sequence length. Numerous short sequences were as valuable as fewer, longer sequences, and performed best when a large sample was sequenced at low individual depth, even when sequencing errors were added. ABC results were similar to those obtained with an alternative method based on the site frequency spectrum (SFS) when performed with unphased GBS-type markers. We conclude that unphased GBS-type data sets can be sufficient to precisely infer simple demographic models, and discuss possible improvements for the use of ABC with genomic data.
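For readers unfamiliar with ABC, the general idea of testing the method on simulated data with a known parameter value can be illustrated with a minimal rejection-ABC sketch. This is not the authors' pipeline: the Poisson "simulator" is a toy stand-in for a coalescent simulator of a two-population expansion model, and the names `simulate_summary`, `abc_rejection`, `theta`, the uniform prior bounds, and the number of loci and simulations are all assumptions chosen only for illustration.

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_summary(theta, n_loci, rng):
    """Toy simulator (assumption): per-locus counts are Poisson(theta).

    Returns one summary statistic, the mean count across loci.
    """
    return sum(poisson(rng, theta) for _ in range(n_loci)) / n_loci


def abc_rejection(observed, n_loci, n_sims, n_keep, rng):
    """Rejection ABC: keep the prior draws whose simulated summary
    statistic lies closest to the observed one."""
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(0.1, 10.0)  # hypothetical uniform prior
        summary = simulate_summary(theta, n_loci, rng)
        draws.append((abs(summary - observed), theta))
    draws.sort()  # smallest distance first
    return [theta for _, theta in draws[:n_keep]]


rng = random.Random(1)
true_theta = 3.0  # known parameter value used to create the pseudo-observed data
observed = simulate_summary(true_theta, 100, rng)
posterior = abc_rejection(observed, n_loci=100, n_sims=2000, n_keep=40, rng=rng)
posterior_mean = statistics.mean(posterior)
print(f"true theta = {true_theta}, ABC posterior mean = {posterior_mean:.2f}")
```

Comparing the spread of `posterior` around `true_theta` across many pseudo-observed data sets is one simple way to measure the precision of an estimate; adding further unknown parameters to the prior would typically widen that spread.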