Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA.
IEEE/ACM Trans Comput Biol Bioinform. 2010 Oct-Dec;7(4):611-8. doi: 10.1109/TCBB.2010.2.
Coalescent likelihood is the probability of observing the given population sequences under the coalescent model. Computing the coalescent likelihood under the infinite sites model is a classic problem in coalescent theory. Existing methods are based on either importance sampling or Markov chain Monte Carlo and are inexact. In this paper, we develop a simple method that can compute the exact coalescent likelihood for many data sets of moderate size, including real biological data whose likelihood was previously thought to be difficult to compute exactly. Our method works for both panmictic and subdivided populations. Simulations demonstrate that the practical range of exact coalescent likelihood computation for panmictic populations is significantly larger than previously believed. We investigate the application of our method to estimating mutation rates by maximum likelihood. One main application of the exact method is assessing the accuracy of approximate methods. To demonstrate the usefulness of the exact method, we evaluate the accuracy of the program Genetree in computing the likelihood for subdivided populations.
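To make the idea of estimating a mutation rate by maximum likelihood concrete, here is a minimal sketch for the simplest possible case only: a sample of two sequences from a panmictic population under the infinite sites model. In that case the number of differences k between the two sequences is geometrically distributed, P(k | theta) = (1 / (1 + theta)) * (theta / (1 + theta))^k, where theta is the scaled mutation rate. This is not the method developed in the paper, which handles larger samples and subdivided populations; the function names and the grid of candidate theta values are assumptions chosen for illustration.

```python
import math


def log_likelihood_pair(k, theta):
    """Exact log-likelihood of k differences between two sequences.

    Toy case (sample size 2, panmictic population, infinite sites):
    P(k | theta) = (1 / (1 + theta)) * (theta / (1 + theta)) ** k
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if theta <= 0:
        raise ValueError("theta must be positive")
    # log P = k * log(theta) - (k + 1) * log(1 + theta)
    return k * math.log(theta) - (k + 1) * math.log1p(theta)


def grid_mle(k, thetas):
    """Return the candidate theta with the highest likelihood (hypothetical helper)."""
    return max(thetas, key=lambda t: log_likelihood_pair(k, t))


# Illustrative grid of candidate theta values: 0.1, 0.2, ..., 20.0
thetas = [0.1 * i for i in range(1, 201)]
theta_hat = grid_mle(3, thetas)
```

For this toy likelihood the maximum can also be found analytically: setting the derivative k/theta - (k + 1)/(1 + theta) to zero gives theta = k, so the grid search should land on the grid point closest to the observed number of differences. The same pattern (evaluate an exact likelihood over candidate rates, keep the maximizer) applies to larger samples, where the likelihood itself is far harder to compute.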