The efficiency of different search strategies in estimating parsimony jackknife, bootstrap, and Bremer support.

Author information

Müller Kai F

Affiliation

Nees-Institut für Biodiversität der Pflanzen, Rheinische Friedrich-Wilhelms-Universität Bonn, Meckenheimer Allee 170, Bonn, D-53115, Germany.

Publication information

BMC Evol Biol. 2005 Oct 29;5:58. doi: 10.1186/1471-2148-5-58.

Abstract

BACKGROUND

For parsimony analyses, confidence is most commonly estimated with resampling plans (nonparametric bootstrap, jackknife) and Bremer support (decay indices). The recent literature reveals that commonly employed parameter settings are not those recommended by theoretical considerations and previous empirical studies. The optimal search strategy to apply during resampling has previously been addressed only via the standard search strategies available in PAUP*. The compromise between search extensiveness and improved support accuracy for Bremer support has received even less attention. A set of experiments was conducted on different datasets to find an empirical cut-off point beyond which increased search extensiveness no longer significantly changes Bremer support or jackknife and bootstrap proportions.
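
The two resampling plans named above differ only in how characters (alignment columns) are drawn for each pseudoreplicate. The following is a minimal sketch of that difference, not the procedure used in the study: the function names are hypothetical, and the jackknife deletion probability of e^-1 (about 37%) is an assumption taken from common parsimony jackknifing practice rather than from this abstract.

```python
import math
import random


def bootstrap_columns(n_chars, rng):
    """Draw n_chars column indices with replacement (nonparametric bootstrap)."""
    return sorted(rng.choices(range(n_chars), k=n_chars))


def jackknife_columns(n_chars, rng, deletion=math.exp(-1)):
    """Keep each column independently with probability 1 - deletion.

    The default deletion probability e^-1 is an assumption, not a value
    stated in the abstract.
    """
    return [i for i in range(n_chars) if rng.random() >= deletion]


def resample_matrix(matrix, columns):
    """Build a pseudoreplicate matrix from the chosen column indices.

    matrix maps taxon name -> character string; all strings have equal length.
    """
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in matrix.items()}


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the sketch is repeatable
    toy = {"taxon_A": "ACGTACGTAC", "taxon_B": "ACGTTCGTAA", "taxon_C": "ACCTACGAAC"}
    n = len(toy["taxon_A"])
    print(resample_matrix(toy, bootstrap_columns(n, rng)))
    print(resample_matrix(toy, jackknife_columns(n, rng)))
```

Each pseudoreplicate matrix would then be passed to a tree search, and the support for a clade is the fraction of replicates in which that clade is recovered.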

RESULTS

For the number of replicates needed to estimate support accurately in resampling plans, a diagram is provided that helps to address whether apparently different support values really differ significantly. It is shown that using random addition cycles and parsimony ratchet iterations during bootstrapping does not translate into higher support, nor does any extension of the search beyond the rather moderate effort of TBR (tree bisection and reconnection branch swapping) plus saving one tree per replicate. Instead, for very large matrices, saving more than one shortest tree per iteration and using a strict consensus tree of these yields decreased support compared to saving only one tree. Saving only one tree can therefore be interpreted as carrying a small risk of overestimating support, but this should be more than compensated by other factors that counteract an enhanced type I error. With regard to Bremer support, a rule of thumb can be derived: not much is gained relative to the surplus computational effort when searches are extended beyond 20 ratchet iterations per constrained node, at least not for datasets that fall within the size range found in the current literature.
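
To illustrate why the number of replicates matters when comparing support values, a support proportion can be treated as a binomial proportion and given an approximate confidence interval. The sketch below uses the Wilson score interval; this choice, the 95% level (z = 1.96), and the function names are assumptions made for illustration and do not reproduce the diagram described in the paper.

```python
import math


def wilson_interval(support, replicates, z=1.96):
    """Approximate Wilson score interval for a support proportion.

    support    -- observed proportion of replicates containing the clade (0..1)
    replicates -- number of bootstrap or jackknife replicates
    z          -- normal quantile; 1.96 gives roughly a 95% interval
    """
    if replicates <= 0:
        raise ValueError("replicates must be positive")
    if not 0.0 <= support <= 1.0:
        raise ValueError("support must lie between 0 and 1")
    n = replicates
    denom = 1.0 + z * z / n
    centre = (support + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(support * (1.0 - support) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def intervals_overlap(a, b):
    """Return True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Compare two apparently different support values at several replicate counts.
    for n in (100, 1000, 10000):
        first = wilson_interval(0.70, n)
        second = wilson_interval(0.75, n)
        print(n, first, second, "overlap" if intervals_overlap(first, second) else "separate")
```

Overlapping intervals are only a rough, conservative indication that two values may not differ; a formal comparison would test the difference directly.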

CONCLUSION

In view of these results, bootstrap or jackknife proportions with narrow confidence intervals can be calculated, even for very large datasets, at less expense than is often thought. In particular, iterated bootstrap methods that aim to reduce the statistical bias inherent in these proportions become more feasible when the individual bootstrap searches require less time.
