Baele Guy, Carvalho Luiz M, Brusselmans Marius, Dudas Gytis, Ji Xiang, McCrone John T, Lemey Philippe, Suchard Marc A, Rambaut Andrew
Department of Microbiology, Immunology and Transplantation, Rega Institute, KU Leuven, Leuven, Belgium.
School of Applied Mathematics, Getulio Vargas Foundation (FGV), Rio de Janeiro, Brazil.
Bioinformatics. 2025 Sep 9. doi: 10.1093/bioinformatics/btaf488.
In Bayesian phylogenetic and phylodynamic studies, it is common to summarize the posterior distribution of trees with a time-calibrated summary phylogeny. While the maximum clade credibility (MCC) tree is often used for this purpose, we show here that a novel summary tree method, the highest independent posterior subtree reconstruction (HIPSTR), yields summary trees whose clades have consistently higher support than those of the MCC tree. We also provide faster computational routines for estimating both summary trees in an updated version of TreeAnnotator X, an open-source software program that summarizes the information in a sample of trees and reports useful statistics, such as the credibility of each clade contained in the summary tree.
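For readers unfamiliar with the terms, the sketch below illustrates the standard definitions of clade credibility (the fraction of sampled trees that contain a clade) and of the MCC tree (the sampled tree whose clades have the highest product of credibilities). It is a minimal, topology-only illustration and not the TreeAnnotator X implementation: the nested-tuple tree encoding, the function names, and the toy tree sample are all assumptions made for this example, and node heights are ignored.

```python
import math
from collections import Counter


def clades(tree):
    """Return the clades (frozensets of tip names) of a nested-tuple tree.

    Tips are plain strings; single-tip clades are not recorded.
    """
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    tips(tree)
    return found


def clade_credibilities(trees):
    """Fraction of sampled trees in which each clade occurs."""
    counts = Counter(c for t in trees for c in clades(t))
    return {c: n / len(trees) for c, n in counts.items()}


def mcc_tree(trees):
    """Sampled tree with the highest sum of log clade credibilities."""
    cred = clade_credibilities(trees)
    return max(trees, key=lambda t: sum(math.log(cred[c]) for c in clades(t)))


# Hypothetical posterior sample of four topologies on tips A-D.
sample = [
    ((("A", "B"), "C"), "D"),
    ((("A", "B"), "C"), "D"),
    (("A", "B"), ("C", "D")),
    ((("A", "C"), "B"), "D"),
]
credibility = clade_credibilities(sample)
best = mcc_tree(sample)
```

Because the MCC tree must be one of the sampled trees, it can only combine clades that happen to co-occur in a single sample, which is one way a well-supported clade can end up excluded.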
HIPSTR and MCC reconstructions on two Ebola virus and two SARS-CoV-2 data sets show that HIPSTR yields summary trees that consistently contain clades with higher support than MCC trees. The MCC trees regularly fail to include several clades with very high posterior probability (≥0.95), as well as a large number of clades with moderate to high posterior probability (≥0.50), whereas HIPSTR, and in particular its majority-rule extension MrHIPSTR, achieves near-perfect performance in this respect. HIPSTR and MrHIPSTR also compare favorably with MCC in computational performance in TreeAnnotator X. Comparison with the recent CCD0-MAP algorithm yielded mixed results and requires more in-depth investigation in follow-up studies.
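The evaluation criterion described above, namely whether a summary tree includes the clades whose posterior probability reaches a given threshold, can be sketched as a simple set comparison. The function name, the clade encoding as frozensets of tip names, and all numbers below are hypothetical and chosen only for illustration; they are not results from the study.

```python
def missed_clades(credibility, summary_clades, threshold):
    """Clades with posterior probability >= threshold that the summary tree lacks."""
    return {
        clade
        for clade, probability in credibility.items()
        if probability >= threshold and clade not in summary_clades
    }


# Hypothetical clade credibilities estimated from a posterior tree sample.
credibility = {
    frozenset("AB"): 0.97,
    frozenset("CD"): 0.62,
    frozenset("ABE"): 0.55,
    frozenset("CDE"): 0.31,
}

# Hypothetical set of clades present in a summary tree.
summary_clades = {frozenset("AB"), frozenset("CDE")}

missed_high = missed_clades(credibility, summary_clades, 0.95)
missed_moderate = missed_clades(credibility, summary_clades, 0.50)
```

In this toy case the summary tree contains the only clade at or above 0.95 but misses two clades at or above 0.50, which is the kind of shortfall the comparison between summary methods counts.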
TreeAnnotator X is distributed as part of the BEAST X (v10.5.0) software package, which is available at https://github.com/beast-dev/beast-mcmc/releases and on Zenodo (DOI: https://doi.org/10.5281/zenodo.4895234).