Department of Genetics, Rutgers, State University of New Jersey, Piscataway, New Jersey 08854.
Genetics. 2010 Feb;184(2):363-79. doi: 10.1534/genetics.109.110528. Epub 2009 Nov 16.
Most methods for studying divergence with gene flow rely upon data from many individuals at few loci. Such data can be useful for inferring recent population history but they are unlikely to contain sufficient information about older events. However, the growing availability of genome sequences suggests a different kind of sampling scheme, one that may be more suited to studying relatively ancient divergence. Data sets extracted from whole-genome alignments may represent very few individuals but contain a very large number of loci. To take advantage of such data we developed a new maximum-likelihood method for genomic data under the isolation-with-migration model. Unlike many coalescent-based likelihood methods, our method does not rely on Monte Carlo sampling of genealogies, but rather provides a precise calculation of the likelihood by numerical integration over all genealogies. We demonstrate that the method works well on simulated data sets. We also consider two models for accommodating mutation rate variation among loci and find that the model that treats mutation rates as random variables leads to better estimates. We applied the method to the divergence of Drosophila melanogaster and D. simulans and detected a low, but statistically significant, signal of gene flow from D. simulans to D. melanogaster.
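The contrast between Monte Carlo sampling of genealogies and direct numerical integration can be illustrated with a deliberately simplified case. The sketch below is not the authors' implementation. It assumes one sequence sampled from each species, zero migration, an infinite-sites mutation model, and a single shared mutation rate, so that the "genealogy" at a locus reduces to one coalescence time. Time is measured in units of 2N generations (N being the ancestral effective size), `theta` stands for 4Nμ per locus, and all function and parameter names are hypothetical. In this special case a closed form exists, which is used only to check the numerical integral.

```python
import math


def poisson_pmf(k, mean):
    """Probability of k events under a Poisson distribution with the given mean."""
    if mean <= 0:
        return float(k == 0)
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def locus_likelihood(k, theta, split_time, upper=60.0, steps=20000):
    """P(k differences at a locus), integrating numerically over coalescence time.

    Assumption: no migration, so the two lineages cannot coalesce before the
    split and afterwards coalesce at rate 1 in the ancestral population.
    """
    # Integrate over s = t - split_time, the waiting time in the ancestor,
    # with Simpson's rule on [0, upper]; exp(-upper) is negligible at 60.
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * poisson_pmf(k, theta * (split_time + s)) * math.exp(-s)
    return total * h / 3


def locus_likelihood_closed_form(k, theta, split_time):
    """Same probability as a Poisson-geometric convolution (zero-migration case only)."""
    ratio = theta / (1.0 + theta)
    return sum(
        poisson_pmf(j, theta * split_time) * ratio ** (k - j) / (1.0 + theta)
        for j in range(k + 1)
    )


def log_likelihood(counts, theta, split_time):
    """Multilocus log-likelihood, treating loci as independent."""
    return sum(math.log(locus_likelihood(k, theta, split_time)) for k in counts)


if __name__ == "__main__":
    counts = [3, 5, 2, 7, 4]  # made-up per-locus difference counts, for illustration only
    print(log_likelihood(counts, theta=2.0, split_time=1.5))
```

With migration, lineages can move between populations before the split, so the integral runs over a richer set of genealogical histories than the single waiting time used here. The sketch only shows why a deterministic integral can replace sampling when the number of sequences per locus is very small.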