Department of Scientific Computing, Florida State University, Tallahassee, FL 32306, USA.
Foundation Medicine Inc, San Diego, CA 92121, USA.
G3 (Bethesda). 2022 Apr 4;12(4). doi: 10.1093/g3journal/jkac040.
Estimating divergence times from multilocus genetic data has become common in population genetics and phylogenetics. We present a new Bayesian inference method that treats the divergence time as a random variable. The divergence time is calculated from an assembly of splitting events on individual lineages in a genealogy. The time of each splitting event is drawn from the hazard function of a truncated normal distribution, which allows easy integration into the standard coalescent framework used in programs such as Migrate. We assess the accuracy of the new method with simulated population splits over a wide range of divergence times and with a reanalysis of a 5-population dataset consisting of 3 present-day populations (Africans, Europeans, Asians) and 2 archaic samples (Altai and Ust'-Ishim). Simple divergence models without subsequent gene flow are estimated with high accuracy, whereas the accuracy of isolation-with-migration models depends on the magnitude of the immigration rate. With high immigration rates, the most recent common ancestor of the sample is reached, looking backward in time, before the divergence time. Even with many independent loci, accurate estimation of the divergence time then becomes problematic. Our comparison with other software tools shows that our lineage-switching method, implemented in Migrate, performs comparably to IMa2p. Migrate can analyze large numbers of sequence loci (>1,000) in parallel on computer clusters.
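The idea of drawing a splitting time from the hazard function of a truncated normal distribution can be illustrated with a minimal Python sketch. It is not taken from Migrate. It assumes that time runs backward from the present, that the normal distribution is truncated below at 0, and that a lineage still unsplit at time `t0` draws its splitting time conditional on that fact. The names `split_hazard`, `sample_split_time`, `mean`, and `sd` are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist


def split_hazard(t, mean, sd):
    """Hazard h(t) = f(t) / S(t) of a normal truncated below at 0.

    With lower truncation the normalising constant cancels, so for
    t >= 0 this equals the hazard of the untruncated normal.
    (Assumption: truncation at 0; the abstract does not state the bounds.)
    """
    if t < 0:
        return 0.0
    nd = NormalDist(mean, sd)
    return nd.pdf(t) / (1.0 - nd.cdf(t))


def sample_split_time(t0, mean, sd, rng):
    """Draw a splitting time T >= t0 for a lineage not yet split at t0.

    Inverse-CDF sampling: solve S(T) / S(t0) = u for u in (0, 1].
    Note: if t0 lies far in the upper tail, S(t0) underflows and
    inv_cdf may raise; a production sampler would need to guard this.
    """
    nd = NormalDist(mean, sd)
    start = max(t0, 0.0)
    surv0 = 1.0 - nd.cdf(start)      # survival probability at the start time
    u = 1.0 - rng.random()           # uniform on (0, 1]
    return nd.inv_cdf(1.0 - u * surv0)


# Usage: five splitting times for lineages sampled at the present (t0 = 0),
# with a hypothetical divergence-time mean of 1.0 and standard deviation 0.5.
rng = random.Random(1)
times = [sample_split_time(0.0, 1.0, 0.5, rng) for _ in range(5)]
print(times)
```

Because each lineage draws its own splitting time, lineages in the same genealogy split at different times scattered around `mean`. In this reading, the divergence time is summarised from that collection of events.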