Balnakeil, Durness, Lairg, UK.
Syst Biol. 2011 Dec;60(6):735-46. doi: 10.1093/sysbio/syr086. Epub 2011 Aug 24.
It has long been recognized that phylogenetic trees are more unbalanced than those generated by a Yule process. Recently, the degree of this imbalance has been quantified using the large set of phylogenetic trees available in the TreeBASE data set. In this article, a more precise analysis of imbalance is undertaken. Trees simulated under a range of models are compared with trees from TreeBASE and two smaller data sets. Several simple models can match the amount of imbalance measured in real data. Most of them also match the variance of imbalance among empirical trees to a remarkable degree. Statistics are developed to measure balance and to distinguish between trees with the same overall imbalance. The match between models and data for these statistics is investigated. In particular, age-dependent (Bellman-Harris) branching processes are studied in detail. It remains difficult to separate the process of macroevolution from biases introduced by sampling. The lessons for phylogenetic analysis are clearer. In particular, the use of the usual proportional-to-distinguishable-arrangements (uniform) prior on tree topologies in Bayesian phylogenetic analysis is not recommended.
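To make the comparison between tree models concrete, the following is a minimal sketch, not the article's own code or statistics. It assumes the Colless index (the sum over internal nodes of the absolute difference in leaf counts between the two child subtrees) as the imbalance measure, which the abstract does not name, and it samples only the sequence of subtree sizes rather than full trees. Under the Yule model the size of one child subtree of an n-leaf node is uniform on 1..n-1; under the proportional-to-distinguishable-arrangements (uniform) model a split of size i has weight C(n, i) T(i) T(n-i), where T(k) = (2k-3)!! counts rooted binary labelled trees. The function names and the parameter values are illustrative.

```python
import random
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def num_trees(k):
    """Number of rooted binary labelled trees with k leaves: (2k-3)!!, and 1 for k = 1."""
    result = 1
    for m in range(3, 2 * k - 2, 2):
        result *= m
    return result


def split_weights(n, model):
    """Unnormalised weights for one child subtree of an n-leaf node having 1..n-1 leaves."""
    if model == "yule":
        return [1] * (n - 1)
    if model == "pda":
        return [comb(n, i) * num_trees(i) * num_trees(n - i) for i in range(1, n)]
    raise ValueError(f"unknown model: {model}")


def sample_colless(n, model, rng):
    """Sample the Colless index of a random n-leaf tree shape under the given model."""
    total = 0
    stack = [n]  # leaf counts of subtrees that still need to be split
    while stack:
        size = stack.pop()
        if size < 2:
            continue
        left = rng.choices(range(1, size), weights=split_weights(size, model))[0]
        total += abs(size - 2 * left)  # |left - right| at this internal node
        stack.extend((left, size - left))
    return total


def mean_colless(n, model, replicates, seed=0):
    """Average Colless index over independently sampled tree shapes."""
    rng = random.Random(seed)
    return sum(sample_colless(n, model, rng) for _ in range(replicates)) / replicates


if __name__ == "__main__":
    for model in ("yule", "pda"):
        print(model, mean_colless(30, model, replicates=500))
```

The weights for the uniform model are exact integers that `random.choices` converts to floating point, so this sketch is only suitable for modest leaf counts. Comparing the two printed means shows how differently the Yule and uniform models distribute imbalance for the same number of leaves.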