Weng J F, Thomas D A, Mareels I
Department of Mechanical Engineering, The University of Melbourne, Melbourne, Australia.
J Comput Biol. 2011 Jan;18(1):67-80. doi: 10.1089/cmb.2009.0232. Epub 2010 Jul 12.
Inferring phylogenies (phylogenetic trees) is one of the central problems in computational biology. There are three main methods for inferring phylogenies: Maximum Parsimony (MP), Distance Matrix (DM), and Maximum Likelihood (ML), of which MP is the best studied and most popular. In the MP method, the optimization criterion is the number of nucleotide substitutions, computed from the differences among the nucleotide sequences under investigation. However, the MP method is often criticized because it counts only the substitutions observable at the present time and omits all the unobservable substitutions that actually occurred during evolutionary history. To account for unobservable substitutions, substitution models have been established and are now widely used in the DM and ML methods, but these models cannot be used within the classical MP method. Recently, the authors proposed a probability representation model for phylogenetic trees; trees reconstructed in this model are called probability phylogenetic trees. One advantage of the probability representation model is that it can incorporate a substitution model to infer phylogenetic trees based on the MP principle. In this paper, we explain how to use a substitution model in the reconstruction of probability phylogenetic trees and show the advantage of this approach with examples.
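The gap between observable and unobservable substitutions described in the abstract can be illustrated with a minimal sketch. It is not the authors' probability representation model, and the abstract does not say which substitution model they use; the Jukes-Cantor (JC69) correction is assumed here only because it is the simplest widely known one. The function names and the two example sequences are hypothetical.

```python
import math


def observed_proportion(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites that differ: the 'observable' substitutions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


def jukes_cantor_distance(p: float) -> float:
    """Expected substitutions per site under JC69, given observed proportion p.

    The correction d = -(3/4) * ln(1 - (4/3) * p) accounts for multiple
    substitutions at the same site; it is undefined for p >= 0.75.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 correction requires 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical aligned sequences, for illustration only.
    seq_a = "ACGTACGTAC"
    seq_b = "ACGTTCGTAA"
    p = observed_proportion(seq_a, seq_b)
    d = jukes_cantor_distance(p)
    print(f"observed proportion of differing sites: {p:.3f}")
    print(f"JC69-corrected substitutions per site:  {d:.3f}")
```

For these two sequences, 2 of 10 sites differ, so the observed proportion is 0.2, while the corrected estimate is about 0.23 substitutions per site. The corrected value is always at least as large as the observed one, which is the sense in which a count based only on present-day differences underestimates the substitutions that occurred.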