Institute for Mathematics and Computer Science, Greifswald University, Walther-Rathenau-Str. 47, 17489, Greifswald, Germany.
Bull Math Biol. 2017 Dec;79(12):2865-2886. doi: 10.1007/s11538-017-0354-6. Epub 2017 Oct 5.
One of the main aims in phylogenetics is the estimation of ancestral sequences from present-day data such as DNA alignments. One way to estimate the data of the last common ancestor of a given set of species is to first reconstruct a phylogenetic tree with a tree inference method and then to apply a method of ancestral state inference to that tree. One of the best-known methods for both tree inference and ancestral sequence inference is Maximum Parsimony (MP). In this manuscript, we focus on this method and on ancestral state inference for fully bifurcating trees. In particular, we investigate a conjecture published by Charleston and Steel in 1995 concerning the number of species that need to have a particular state, say a, at a particular site in order for MP to unambiguously return a as the estimate for the state of the last common ancestor. We prove the conjecture for all even numbers of character states, which is the most relevant case in biology. We also show that the conjecture does not hold in general for odd numbers of character states, but we present some positive results for this case as well.
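To make the notion of an unambiguous MP estimate at the root concrete, the following is a minimal sketch, assuming the standard bottom-up pass of Fitch's algorithm on a rooted, fully bifurcating tree. The abstract does not name this algorithm; it is used here only as a common way to compute the set of root states that MP allows. The nested-tuple tree encoding and the function names `fitch_root_set` and `mp_root_is_unambiguous` are hypothetical and chosen for illustration.

```python
def fitch_root_set(tree):
    """Return the Fitch state set at the root of a rooted binary tree.

    Assumed encoding (illustrative only): a leaf is a string naming its
    character state, and an internal node is a 2-tuple (left, right).
    """
    if isinstance(tree, str):
        # A leaf contributes exactly its observed state.
        return frozenset([tree])
    left, right = tree
    left_set = fitch_root_set(left)
    right_set = fitch_root_set(right)
    common = left_set & right_set
    # Fitch rule: take the intersection if it is non-empty, else the union.
    return common if common else left_set | right_set


def mp_root_is_unambiguous(tree, state):
    """True if the root state set consists of `state` alone."""
    return fitch_root_set(tree) == frozenset([state])


# Three of four leaves in state "a": the root estimate is unambiguous.
small_tree = (("a", "a"), ("a", "b"))

# Four of eight leaves in state "a", yet the root estimate is ambiguous.
larger_tree = ((("a", "a"), ("a", "b")), (("a", "c"), ("b", "c")))
```

In this sketch, `mp_root_is_unambiguous(small_tree, "a")` holds, whereas `larger_tree` yields a root set with two states, so MP does not single out a there. The conjecture discussed above concerns how many leaves must be in state a to rule out the second situation.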