Department of Mathematics and Statistics, University of Alaska Fairbanks, P.O. Box 756660, Fairbanks, AK, 99775, USA.
Department of Mathematics and Statistics, The University of New Mexico, Albuquerque, NM, 87131, USA.
Bull Math Biol. 2018 Jan;80(1):64-103. doi: 10.1007/s11538-017-0363-5. Epub 2017 Nov 10.
Using topological summaries of gene trees as a basis for species tree inference is a promising approach to obtain acceptable speed on genomic-scale datasets, and to avoid some undesirable modeling assumptions. Here we study the probabilities of splits on gene trees under the multispecies coalescent model, and how their features might inform species tree inference. After investigating the behavior of split consensus methods, we examine split invariants, that is, polynomial relationships between split probabilities. These invariants are then used to show that, even though a split is an unrooted notion, split probabilities retain enough information to identify the rooted species tree topology for trees of 5 or more taxa, with one possible 6-taxon exception.
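To make the notion of a split concrete, the following minimal sketch lists the nontrivial splits displayed by each gene tree and tabulates their relative frequencies across a sample. It is an illustration only, not the authors' method: the nested-tuple tree encoding and the function names `leaves`, `splits`, and `split_frequencies` are assumptions introduced here, and the frequencies are merely empirical stand-ins for the split probabilities that the paper studies under the multispecies coalescent model.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf labels below a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the nontrivial splits (bipartitions of the taxa) displayed by a tree."""
    taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        rest = taxa - clade
        # Keep only splits with at least two taxa on each side.
        if len(clade) >= 2 and len(rest) >= 2:
            # Store the split as an unordered pair, so the rooting is ignored.
            found.add(frozenset([clade, rest]))
        for child in node:
            visit(child)

    visit(tree)
    return found


def split_frequencies(gene_trees):
    """Return the fraction of gene trees that display each split."""
    counts = Counter()
    for tree in gene_trees:
        counts.update(splits(tree))
    return {split: count / len(gene_trees) for split, count in counts.items()}


# Hypothetical sample of three 5-taxon gene trees.
gene_trees = [
    (("a", "b"), ("c", ("d", "e"))),
    (("a", "b"), (("c", "d"), "e")),
    ((("a", "c"), "b"), ("d", "e")),
]
freqs = split_frequencies(gene_trees)
```

Because each split is stored as an unordered pair of taxon sets, two differently rooted drawings of the same unrooted tree give the same splits, which matches the abstract's remark that a split is an unrooted notion.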