Bioinformatics Research Center, Aarhus University, Aarhus, Denmark.
PLoS Genet. 2012;8(12):e1003125. doi: 10.1371/journal.pgen.1003125. Epub 2012 Dec 20.
We present a hidden Markov model (HMM) for inferring gradual isolation between two populations during speciation, modelled as a time interval with restricted gene flow. The HMM describes the history of adjacent nucleotides in two genomic sequences, such that the nucleotides can be separated by recombination, can migrate between populations, or can coalesce at variable time points, all dependent on the parameters of the model, which are the effective population sizes, splitting times, recombination rate, and migration rate. We show by extensive simulations that the HMM can accurately infer all parameters except the recombination rate, which is biased downwards. Inference is robust to variation in the mutation rate and the recombination rate along the sequence, and also to unknown phase of the genomes unless they are very closely related. We provide a test for whether divergence is gradual or instantaneous, and we apply the model to three key divergence processes in great apes: (a) the bonobo and common chimpanzee, (b) the eastern and western gorilla, and (c) the Sumatran and Bornean orang-utan. We find that the bonobo and chimpanzee appear to have undergone a clear split, whereas the divergence processes of the gorilla and orang-utan species occurred over several hundred thousand years, with gene flow stopping quite recently. We also apply the model to the Homo/Pan speciation event and find that the most likely scenario involves an extended period of gene flow during speciation.
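As a rough illustration of how an HMM assigns a likelihood to a pairwise alignment, the following minimal sketch runs the standard scaled forward algorithm on a toy two-state model. It is not the authors' model or implementation. The function name `forward_log_likelihood`, the two hidden states (read here as "recent" and "ancient" coalescence), and every numeric value are assumptions made up for illustration. Observations are coded 0 when the two sequences agree at a site and 1 when they differ.

```python
import math

def forward_log_likelihood(obs, init, trans, emit):
    """Scaled forward algorithm: returns log P(obs) under a discrete HMM.

    obs   -- list of observation indices (here 0 = same base, 1 = different)
    init  -- init[i] is the probability of starting in hidden state i
    trans -- trans[i][j] is the probability of moving from state i to state j
    emit  -- emit[i][o] is the probability of observing o in state i
    """
    n = len(init)
    # Initialise with the first observation, then rescale to avoid underflow.
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for o in obs[1:]:
        # Propagate one site along the alignment and weight by the emission.
        alpha = [
            emit[j][o] * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Hypothetical toy values, not taken from the paper.
init = [0.5, 0.5]                      # state 0 = recent, state 1 = ancient coalescence
trans = [[0.99, 0.01], [0.01, 0.99]]   # state changes stand in for recombination
emit = [[0.999, 0.001], [0.99, 0.01]]  # older coalescence gives more differences
obs = [0, 0, 1, 0, 0, 0, 1, 0]

print(forward_log_likelihood(obs, init, trans, emit))
```

In a model of the kind described above, the transition and emission probabilities would presumably be functions of the population sizes, split times, recombination rate, and migration rate, and parameter estimates would be obtained by maximising this log-likelihood over those parameters.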