Department of Scientific Computing, Florida State University, Tallahassee, FL 32306, USA.
Genetics. 2010 May;185(1):313-26. doi: 10.1534/genetics.109.112532. Epub 2010 Feb 22.
For many biological investigations, groups of individuals are genetically sampled from several geographic locations. These sampling locations often do not reflect the genetic population structure. We describe a framework using marginal likelihoods to compare and order structured population models, such as testing whether the sampling locations belong to the same randomly mating population or comparing unidirectional and multidirectional gene flow models. In the context of inferences employing Markov chain Monte Carlo methods, the accuracy of the marginal likelihoods depends heavily on the approximation method used to calculate the marginal likelihood. Two methods, modified thermodynamic integration and a stabilized harmonic mean estimator, are compared. With finite Markov chain Monte Carlo run lengths, the harmonic mean estimator may not be consistent. Thermodynamic integration, in contrast, delivers considerably better estimates of the marginal likelihood. The choice of prior distributions does not influence the order and choice of the better models when the marginal likelihood is estimated using thermodynamic integration, whereas with the harmonic mean estimator the influence of the prior is pronounced and the order of the models changes. The approximation of marginal likelihood using thermodynamic integration in MIGRATE allows the evaluation of complex population genetic models, not only of whether sampling locations belong to a single panmictic population, but also of competing complex structured population models.
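The contrast between the two estimators can be illustrated on a toy problem. The following is a minimal sketch, not the MIGRATE implementation and not the modified or stabilized variants compared in the paper. It assumes a hypothetical conjugate model (observations y_i ~ Normal(mu, 1) with prior mu ~ Normal(0, tau^2)), chosen because its marginal likelihood has a closed form and its power posteriors can be sampled exactly, so no Markov chain Monte Carlo run is needed. All function names, the temperature schedule, and the sample sizes are illustrative assumptions. Thermodynamic integration uses the identity log Z = integral over t from 0 to 1 of E_t[log L], approximated here with the trapezoidal rule; the harmonic mean estimator uses 1/Z = E_posterior[1/L].

```python
import math
import random


def summarize(data):
    """Sufficient statistics (n, sum, sum of squares) for the toy model."""
    return len(data), sum(data), sum(y * y for y in data)


def log_likelihood(mu, stats):
    """log p(data | mu) for y_i ~ Normal(mu, 1)."""
    n, s, ss = stats
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * (ss - 2 * mu * s + n * mu * mu)


def exact_log_marginal(stats, tau):
    """Closed-form log marginal likelihood under the prior mu ~ Normal(0, tau^2)."""
    n, s, ss = stats
    quad = ss - tau**2 * s**2 / (1 + n * tau**2)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1 + n * tau**2) - 0.5 * quad)


def sample_power_posterior(stats, tau, t, size, rng):
    """Exact draws from prior(mu) * L(mu)^t, which is Gaussian in this toy model.

    Assumption: exact sampling stands in for an MCMC run at temperature t.
    """
    n, s, _ = stats
    precision = t * n + 1 / tau**2
    mean = t * s / precision
    sd = math.sqrt(1 / precision)
    return [rng.gauss(mean, sd) for _ in range(size)]


def thermodynamic_integration(stats, tau, n_temps, size, rng):
    """Trapezoidal estimate of log Z = integral_0^1 E_t[log L] dt."""
    # Temperatures are packed near t = 0, where E_t[log L] changes fastest.
    temps = [(i / (n_temps - 1)) ** 5 for i in range(n_temps)]
    means = []
    for t in temps:
        draws = sample_power_posterior(stats, tau, t, size, rng)
        means.append(sum(log_likelihood(mu, stats) for mu in draws) / size)
    total = 0.0
    for i in range(1, n_temps):
        total += 0.5 * (temps[i] - temps[i - 1]) * (means[i] + means[i - 1])
    return total


def harmonic_mean(stats, tau, size, rng):
    """Plain harmonic mean estimate of log Z from posterior draws (t = 1)."""
    draws = sample_power_posterior(stats, tau, 1.0, size, rng)
    neg = [-log_likelihood(mu, stats) for mu in draws]
    top = max(neg)  # log-sum-exp guards against overflow only
    log_mean_inverse = top + math.log(sum(math.exp(v - top) for v in neg)) - math.log(size)
    return -log_mean_inverse


if __name__ == "__main__":
    rng = random.Random(2010)
    data = [rng.gauss(1.5, 1.0) for _ in range(20)]  # simulated observations
    stats = summarize(data)
    for tau in (1.0, 3.0, 10.0):  # three prior widths
        print("tau", tau,
              "exact", exact_log_marginal(stats, tau),
              "TI", thermodynamic_integration(stats, tau, 50, 2000, rng),
              "HM", harmonic_mean(stats, tau, 2000, rng))
```

In this toy setting the exact value changes with the prior width tau, and the thermodynamic integration estimate should follow it closely. The harmonic mean estimate depends only on posterior draws, which barely move when the prior changes, so it can miss that change; this is one known weakness of the plain estimator and is offered only as intuition for the finite-run behavior described above.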