School of Mathematical Sciences, Peking University, Beijing, China.
Mol Biol Evol. 2012 Oct;29(10):3131-42. doi: 10.1093/molbev/mss118. Epub 2012 Apr 13.
We implement an isolation-with-migration model for three species, with migration occurring between two closely related species while an out-group species is used to provide further information concerning gene trees and model parameters. The model is implemented in the likelihood framework for analyzing multilocus genomic sequence alignments, with one sequence sampled from each of the three species. The prior distribution of gene tree topology and branch lengths at every locus is calculated using a Markov chain characterization of the genealogical process of coalescence and migration, which integrates over the histories of migration events analytically. The likelihood function is calculated by integrating over branch lengths in the gene trees (coalescent times) numerically. We analyze the model to study the gene tree-species tree mismatch probability and the time to the most recent common ancestor at a locus. The model is used to construct a likelihood ratio test (LRT) of speciation with gene flow. We conducted computer simulations to evaluate the LRT and found that the test is generally conservative, with the false positive rate well below the significance level. For the test to have substantial power, hundreds of loci are needed. Application of the test to a human-chimpanzee-gorilla genomic data set suggests gene flow around the time of speciation of the human and the chimpanzee.
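As a point of reference for the gene tree-species tree mismatch probability mentioned above, the following is a minimal sketch of the standard result for three species in the special case of no migration. It is not the paper's model, which allows gene flow and integrates over migration histories. The function name `mismatch_probability` and the parameter names `internal_branch` and `theta` are illustrative assumptions: `internal_branch` is the length of the internal branch of the species tree and `theta` is the population size parameter (4Nμ) of the ancestral population of the two sister species, both measured in expected mutations per site.

```python
import math


def mismatch_probability(internal_branch: float, theta: float) -> float:
    """Gene tree-species tree mismatch probability for three species.

    Assumes NO migration (a simplification of the model in the abstract).
    Two lineages coalesce at rate 2/theta per unit of mutational time, so
    they fail to coalesce along the internal branch with probability
    exp(-2 * internal_branch / theta). If that happens, all three lineages
    enter the root population, where each of the three topologies is
    equally likely and two of them disagree with the species tree.
    """
    if internal_branch < 0:
        raise ValueError("internal_branch must be non-negative")
    if theta <= 0:
        raise ValueError("theta must be positive")
    p_no_coalescence = math.exp(-2.0 * internal_branch / theta)
    return (2.0 / 3.0) * p_no_coalescence


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    for branch in (0.0, 0.001, 0.005):
        print(branch, mismatch_probability(branch, theta=0.01))
```

With an internal branch of length zero the probability reaches its maximum of 2/3, and it decays exponentially as the branch lengthens relative to `theta`. Under this no-migration baseline the two mismatching topologies are equally probable, which is why it is a natural point of comparison for a model with gene flow.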