Homoplasy and clade support.

Author information

Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720-3160, USA.

Publication information

Syst Biol. 2009 Apr;58(2):184-98. doi: 10.1093/sysbio/syp019. Epub 2009 Jun 29.

Abstract

Distinguishing phylogenetic signal from homoplasy (shared similarities among taxa that do not arise by common ancestry) is an implicit goal of any phylogenetic study. Large amounts of homoplasy can interfere with accurate tree inference, and it is expected that common measures of clade support, including bootstrap proportions and Bayesian posterior probabilities, should also be impacted to some degree by homoplasy. Through data simulation and analysis of 38 empirical data sets, we show that high amounts of homoplasy will affect all measures of clade support in a manner that is dependent on clade size. More specifically, the smallest taxon bipartitions in an unrooted tree topology will receive higher support relative to clades of intermediate sizes, even when all clades are supported by the same amount of data. We determine that the ultimate causes of this effect are the inclusion of random trees (due to homoplasy) during bootstrap resampling and Markov chain Monte Carlo (MCMC) topology searching and the higher relative proportion of small taxon bipartitions (i.e., 2 or 3 taxa) to larger sized bipartitions. However, the use of explicit model-based methods, especially Bayesian MCMC methods, effectively overcomes this clade size effect even when very small amounts of phylogenetic signal are present. We develop a post hoc statistic, the clade disparity index (CDI), to measure both the relative magnitude of the clade size effect and its statistical significance. In analyses of both simulated and empirical data, CDI values indicate that Bayesian MCMC analyses are substantially more likely to estimate clade support values that are uncorrelated with clade size than are maximum parsimony and maximum likelihood bootstrap analyses and thus less affected by homoplasy. 
These results may be especially relevant to "deep" phylogenetic problems, such as reconstructing the tree of life, as they represent the largest possible extremes of time and evolutionary rates, 2 factors that cause homoplasy.
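The abstract attributes the clade size effect partly to the higher relative proportion of small bipartitions among random trees. As a rough illustration only, the sketch below assumes that "random" means a uniform draw over all unrooted binary topologies, which is a simplification and not necessarily the model used in the paper's simulations. Under that assumption, the chance that one specific bipartition separating k named taxa from the remaining n - k appears in a random tree is (2k-3)!!(2(n-k)-3)!!/(2n-5)!!. The function names are hypothetical and chosen for this example.

```python
from fractions import Fraction


def double_factorial(m: int) -> int:
    """Return m!! for odd m, treating (-1)!! and 1!! as 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result


def split_probability(n_taxa: int, k: int) -> Fraction:
    """Probability that one specific bipartition (k named taxa versus the
    other n_taxa - k) occurs in a uniformly random unrooted binary tree.

    Assumption: uniform distribution over topologies (illustrative only).
    """
    if not 2 <= k <= n_taxa - 2:
        raise ValueError("k must define a nontrivial bipartition")
    # Trees containing the split: one rooted subtree on each side of an edge.
    with_split = double_factorial(2 * k - 3) * double_factorial(2 * (n_taxa - k) - 3)
    # Total number of unrooted binary trees on n_taxa labelled leaves.
    all_trees = double_factorial(2 * n_taxa - 5)
    return Fraction(with_split, all_trees)


if __name__ == "__main__":
    n = 10
    for k in range(2, n // 2 + 1):
        print(k, split_probability(n, k))
```

With 10 taxa, a given 2-taxon bipartition appears in 1/15 of all topologies, while a given 5-taxon bipartition appears in only 7/1287 of them. If homoplasy pushes resampled or sampled trees toward random topologies, small bipartitions would therefore recur by chance more often than intermediate-sized ones, which is consistent with the direction of the effect described above.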
