Pattengale Nicholas D, Alipour Masoud, Bininda-Emonds Olaf R P, Moret Bernard M E, Stamatakis Alexandros
Department of Computer Science, University of New Mexico, Albuquerque, New Mexico 87123, USA.
J Comput Biol. 2010 Mar;17(3):337-54. doi: 10.1089/cmb.2009.0179.
Phylogenetic bootstrapping (BS) is a standard technique for inferring confidence values on phylogenetic trees. It reconstructs many trees, called replicates, from minor variations of the input data. BS is used with all phylogenetic reconstruction approaches, but we focus here on one of the most popular, maximum likelihood (ML). Because ML inference is so computationally demanding, it has to date proved too expensive to assess how the number of replicates used in BS affects the relative accuracy of the support values. For the same reason, only a rather small number of BS replicates (typically 100) is computed in real-world studies. Stamatakis et al. recently introduced a BS algorithm that is 1 to 2 orders of magnitude faster than previous techniques while yielding qualitatively comparable support values, making an experimental study possible. In this article, we propose stopping criteria (that is, thresholds computed at runtime to determine when enough replicates have been generated), and we report on the first large-scale experimental study to assess the effect of the number of replicates on the quality of support values, including the performance of our proposed criteria. We run our tests on 17 diverse real-world DNA datasets, single-gene as well as multi-gene, containing 125 to 2,554 taxa. We find that our stopping criteria typically stop computations after 100 to 500 replicates (although the most conservative criterion may continue for several thousand replicates) while producing support values that correlate at better than 99.5% with the reference values on the best ML trees. Significantly, we also find that the stopping criteria can recommend very different numbers of replicates for different datasets of comparable sizes.
Our results are thus twofold: (i) they give the first experimental assessment of the effect of the number of BS replicates on the quality of support values returned through BS, and (ii) they validate our proposals for stopping criteria. Practitioners will no longer have to guess the number of replicates or worry about the quality of support values; moreover, with most replicate counts in the 100 to 500 range, robust BS under ML inference becomes computationally practical for most datasets. The complete test suite is available at http://lcbb.epfl.ch/BS.tar.bz2, and BS with our stopping criteria is included in the latest release of RAxML, v7.2.5, available at http://wwwkramer.in.tum.de/exelixis/software.html.
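To make the idea of a runtime stopping criterion concrete, the following Python sketch shows one plausible shape such a test could take. It is an illustration under stated assumptions, not the criteria defined in the article or implemented in RAxML: it assumes each replicate is reduced to a set of hashable bipartition labels, and it declares convergence when random split-half comparisons of bipartition frequencies agree closely. All function names, parameters, and default thresholds (`step`, `min_corr`, `min_passes`, and so on) are hypothetical.

```python
import random
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n == 0:
        return 1.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # Degenerate case: at least one vector is constant.
        return 1.0 if list(xs) == list(ys) else 0.0
    return sxy / sqrt(sxx * syy)


def split_half_agreement(replicates, rng):
    """Shuffle the replicates, split them into two equal halves, and
    correlate the bipartition frequencies observed in each half."""
    shuffled = list(replicates)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    first = Counter(b for tree in shuffled[:half] for b in tree)
    second = Counter(b for tree in shuffled[half:2 * half] for b in tree)
    keys = list(set(first) | set(second))
    return pearson([first[k] / half for k in keys],
                   [second[k] / half for k in keys])


def enough_replicates(replicates, permutations=100, min_corr=0.99,
                      min_passes=99, seed=0):
    """Assumed criterion: stop once at least `min_passes` of the random
    split-half comparisons reach a correlation of `min_corr`."""
    rng = random.Random(seed)
    passes = sum(split_half_agreement(replicates, rng) >= min_corr
                 for _ in range(permutations))
    return passes >= min_passes


def bootstrap_until_stable(make_replicate, step=50, max_replicates=1000):
    """Generate replicates in batches of `step`, testing after each batch.
    `make_replicate` is a hypothetical callable returning one replicate
    as a set of bipartition labels."""
    replicates = []
    while len(replicates) < max_replicates:
        replicates.extend(make_replicate() for _ in range(step))
        if enough_replicates(replicates):
            break
    return replicates
```

In this sketch the expensive step, inferring one ML tree per resampled alignment, is hidden behind `make_replicate`; the check itself only counts bipartitions, so it is cheap relative to tree inference. A dataset whose replicates agree strongly would stop after the first batch, while a noisy one would keep going until `max_replicates`, which mirrors the observation that datasets of comparable size can need very different numbers of replicates.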