The Asymptotic Behavior of Bootstrap Support Values in Molecular Phylogenetics.

Affiliations

Department of Mathematics, Beijing Jiaotong University, Beijing, 100044, China.

Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK.

Publication information

Syst Biol. 2021 Jun 16;70(4):774-785. doi: 10.1093/sysbio/syaa100.

Abstract

The phylogenetic bootstrap is the most commonly used method for assessing statistical confidence in estimated phylogenies by non-Bayesian methods such as maximum parsimony and maximum likelihood (ML). It is observed that bootstrap support tends to be high in large genomic data sets whether or not the inferred trees and clades are correct. Here, we study the asymptotic behavior of bootstrap support for the ML tree in large data sets when the competing phylogenetic trees are equally right or equally wrong. We consider phylogenetic reconstruction as a problem of statistical model selection when the compared models are nonnested and misspecified. The bootstrap is found to have qualitatively different dynamics from Bayesian inference and does not exhibit the polarized behavior of posterior model probabilities, consistent with the empirical observation that the bootstrap is more conservative than Bayesian probabilities. Nevertheless, bootstrap support similarly shows fluctuations among large data sets, with no convergence to a point value, when the compared models are equally right or equally wrong. Thus, in large data sets strong support for wrong trees or models is likely to occur. Our analysis provides a partial explanation for the high bootstrap support values for incorrect clades observed in empirical data analysis. [Bootstrap; model selection; star-tree paradox; support value.].
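The lack of convergence described in the abstract can be illustrated with a toy model-selection problem. The sketch below is not the authors' analysis and involves no trees: it is a minimal illustration under assumed settings. Data are drawn from N(0, 1), and the two fixed candidate models, N(-mu, 1) and N(+mu, 1), are equally wrong. The function names, the value of `mu`, and the numbers of sites and replicates are all hypothetical choices made for illustration.

```python
import random


def bootstrap_support(diffs, n_boot, rng):
    """Fraction of bootstrap replicates in which model 1 has the higher
    total log-likelihood (a toy stand-in for bootstrap support)."""
    n = len(diffs)
    wins = 0
    for _ in range(n_boot):
        # Resample sites with replacement, as in the nonparametric bootstrap.
        resample = rng.choices(diffs, k=n)
        if sum(resample) > 0:
            wins += 1
    return wins / n_boot


def simulate_supports(n_sites, n_datasets, n_boot=200, mu=0.5, seed=1):
    """Bootstrap support for model 1 in several independent data sets.

    Assumed toy setup (not from the paper): the true model is N(0, 1);
    model 1 is N(-mu, 1) and model 2 is N(+mu, 1), so both are equally wrong.
    """
    rng = random.Random(seed)
    supports = []
    for _ in range(n_datasets):
        data = [rng.gauss(0.0, 1.0) for _ in range(n_sites)]
        # Per-site log-likelihood difference: log f1(x) - log f2(x) = -2*mu*x.
        diffs = [-2.0 * mu * x for x in data]
        supports.append(bootstrap_support(diffs, n_boot, rng))
    return supports


if __name__ == "__main__":
    for n_sites in (100, 1000):
        supports = simulate_supports(n_sites, n_datasets=10)
        print(n_sites, [round(s, 2) for s in supports])
```

In this toy setting the support values should remain spread across the unit interval as the number of sites grows, rather than settling near a single value such as 0.5, which mirrors the qualitative behavior the abstract reports for equally right or equally wrong models.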
