Noah A. Rosenberg, Randa Tao
Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109-2218, USA.
Syst Biol. 2008 Feb;57(1):131-40. doi: 10.1080/10635150801905535.
Under a coalescent model for within-species evolution, gene trees may differ from species trees to such an extent that the gene tree topology most likely to evolve along the branches of a species tree can disagree with the species tree topology. Gene tree topologies that are more likely to be produced than the topology that matches that of the species tree are termed anomalous, and the region of branch-length space that gives rise to anomalous gene trees (AGTs) is the anomaly zone. We examine the occurrence of anomalous gene trees for the case of five taxa, the smallest number of taxa for which every species tree topology has a nonempty anomaly zone. Considering all sets of branch lengths that give rise to anomalous gene trees, the largest value possible for the smallest branch length in the species tree is greater in the five-taxon case (0.1934 coalescent time units) than in the previously studied case of four taxa (0.1568). The five-taxon case demonstrates the existence of three phenomena that do not occur in the four-taxon case. First, anomalous gene trees can have the same unlabeled topology as the species tree. Second, the anomaly zone does not necessarily enclose a ball centered at the origin in branch-length space, in which all branches are short. Third, as a branch length increases, it is possible for the number of AGTs to increase rather than decrease or remain constant. These results, which help to describe how the properties of anomalous gene trees increase in complexity as the number of taxa increases, will be useful in formulating strategies for evading the problem of anomalous gene trees during species tree inference from multilocus data.
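The definition of an anomalous gene tree above can be made concrete with the simplest case, three taxa, which is not treated in this abstract. The sketch below uses the standard coalescent result for a species tree ((A,B),C) with one internal branch of length t (in coalescent time units): the matching gene tree topology has probability 1 - (2/3)e^{-t}, and each of the two discordant topologies has probability (1/3)e^{-t}. The function names and the topology strings are illustrative assumptions, and the five-taxon calculations reported in the abstract need the full gene tree probability machinery, which is not reproduced here.

```python
import math


def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for the species tree ((A,B),C).

    t is the internal branch length in coalescent time units.
    Hypothetical helper for illustration; one sampled lineage per species is assumed.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability that the A and B lineages fail to coalesce on the internal branch.
    p_no_coalescence = math.exp(-t)
    # If they fail, the three lineages coalesce in the root population,
    # where each of the three topologies is equally likely.
    discordant = p_no_coalescence / 3.0
    matching = 1.0 - 2.0 * discordant
    return {
        "((A,B),C)": matching,
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


def anomalous_gene_trees(probs, species_topology):
    """Return topologies strictly more probable than the one matching the species tree."""
    reference = probs[species_topology]
    return [
        topology
        for topology, p in probs.items()
        if topology != species_topology and p > reference
    ]


if __name__ == "__main__":
    # 0.1568 and 0.1934 are the branch lengths quoted in the abstract; the others are arbitrary.
    for t in (0.01, 0.1568, 0.1934, 1.0):
        probs = three_taxon_gene_tree_probs(t)
        print(t, probs, anomalous_gene_trees(probs, "((A,B),C)"))
```

For every positive t the list of anomalous gene trees is empty, because 1 - (2/3)e^{-t} exceeds (1/3)e^{-t} whenever t > 0. This is why anomaly zones only appear with four or more taxa, and why the same comparison of topology probabilities, applied to larger trees, defines the regions the abstract describes.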