The Shmunis School of Biomedicine and Cancer Research, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel.
School of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel.
Bioinformatics. 2024 Jun 28;40(Suppl 1):i208-i217. doi: 10.1093/bioinformatics/btae255.
Current methods for estimating branch support in phylogenetic analyses often rely on Felsenstein's classic bootstrap, parametric tests, or their approximations. Because these branch support scores are so widely used, it is important that they be accurate, fast, and interpretable.
Here, we employed a data-driven approach to estimate branch support values with a probabilistic interpretation. To this end, we simulated thousands of realistic phylogenetic trees and the corresponding multiple sequence alignments. Each alignment was used to infer a phylogeny with state-of-the-art phylogenetic inference software, and the inferred tree was then compared to the true tree. Using these extensive data, we trained machine-learning algorithms to estimate branch support values for each bipartition within the maximum-likelihood trees obtained by each program. Our results demonstrate that our model provides fast and more accurate probability-based branch support values than commonly used procedures. We demonstrate the applicability of our approach on empirical datasets.
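The comparison step described above can be illustrated with a minimal sketch. It is not the authors' code: the nested-tuple tree representation and the function names `bipartitions` and `label_branches` are assumptions made for illustration. It shows only how each bipartition of an inferred tree might be labeled as present in or absent from the true tree, which is the kind of binary target a classifier could be trained on. Feature extraction and the learning model itself are omitted.

```python
def bipartitions(tree, taxa):
    """Return the non-trivial bipartitions of a tree given as nested tuples.

    Each bipartition is stored as the frozenset of taxa on the side that
    contains a fixed reference taxon, so that both ways of writing the same
    split compare equal. (The representation is hypothetical.)
    """
    taxa = frozenset(taxa)
    reference = min(taxa)
    splits = set()

    def collect(node):
        if isinstance(node, str):  # a leaf is just a taxon name
            return frozenset([node])
        clade = frozenset().union(*(collect(child) for child in node))
        # Skip trivial splits (single leaves and the full taxon set).
        if 2 <= len(clade) <= len(taxa) - 2:
            splits.add(clade if reference in clade else taxa - clade)
        return clade

    collect(tree)
    return splits


def label_branches(inferred_tree, true_tree, taxa):
    """Map each bipartition of the inferred tree to True if the true tree has it."""
    true_splits = bipartitions(true_tree, taxa)
    return {
        split: split in true_splits
        for split in bipartitions(inferred_tree, taxa)
    }


# Toy example with six taxa; the trees are made up for illustration.
taxa = ["A", "B", "C", "D", "E", "F"]
true_tree = ((("A", "B"), "C"), ("D", ("E", "F")))
inferred_tree = ((("A", "C"), "B"), ("D", ("E", "F")))

labels = label_branches(inferred_tree, true_tree, taxa)
for split, is_correct in sorted(labels.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(split), "correct" if is_correct else "incorrect")
```

In this toy case the inferred tree wrongly groups A with C, so that bipartition is labeled incorrect, while the two splits shared with the true tree are labeled correct. A model trained on such labels outputs, for each branch, an estimated probability that the bipartition is in the true tree.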
The data supporting this work are available in the Figshare repository at https://doi.org/10.6084/m9.figshare.25050554.v1, and the underlying code is accessible via GitHub at https://github.com/noaeker/bootstrap_repo.