Dipartimento di Ingegneria e Architettura, Università degli Studi di Trieste, Trieste 34127, Italy.
Dipartimento di Matematica e Geoscienze, Università degli Studi di Trieste, Trieste 34128, Italy.
Bioinformatics. 2023 Nov 1;39(11). doi: 10.1093/bioinformatics/btad660.
The Balanced Minimum Evolution (BME) is a powerful distance based phylogenetic estimation model introduced by Desper and Gascuel and nowadays implemented in popular tools for phylogenetic analyses. It was proven to be computationally less demanding than more sophisticated estimation methods, e.g. maximum likelihood or Bayesian inference while preserving the statistical consistency and the ability to run with almost any kind of data for which a dissimilarity measure is available. BME can be stated in terms of a nonlinear non-convex combinatorial optimization problem, usually referred to as the Balanced Minimum Evolution Problem (BMEP). Currently, the state-of-the-art among approximate methods for the BMEP is represented by FastME (version 2.0), a software which implements several deterministic phylogenetic construction heuristics combined with a local search on specific neighbourhoods derived by classical topological tree rearrangements. These combinations, however, may not guarantee convergence to close-to-optimal solutions to the problem due to the lack of solution space exploration, a phenomenon which is exacerbated when tackling molecular datasets characterized by a large number of taxa.
To overcome such convergence issues, in this article, we propose a novel metaheuristic, named PhyloES, which exploits the combination of an exploration phase based on Evolution Strategies, a special type of evolutionary algorithm, with a refinement phase based on two local search algorithms. Extensive computational experiments show that PhyloES consistently outperforms FastME, especially when tackling larger datasets, providing solutions characterized by a shorter tree length but also significantly different from the topological perspective.
The software and the data are available at https://github.com/andygaspar/PHYLOES.
平衡最小进化(BME)是一种强大的基于距离的系统发育估计模型,由 Desper 和 Gascuel 引入,现在已被用于系统发育分析的流行工具中。它被证明在计算上比更复杂的估计方法(如最大似然或贝叶斯推断)要求更低,同时保持了统计一致性和能够处理几乎任何可用距离测量的数据集的能力。BME 可以用非线性非凸组合优化问题来表示,通常称为平衡最小进化问题(BMEP)。目前,BMEP 的近似方法中的最新技术是 FastME(版本 2.0),这是一个软件,它实现了几种确定性系统发育构建启发式方法,并结合了通过经典拓扑树重排得到的特定邻域的局部搜索。然而,由于缺乏对解空间的探索,这些组合可能无法保证收敛到接近最优的问题解决方案,这种现象在处理具有大量分类群的分子数据集时更为严重。
为了解决这些收敛问题,在本文中,我们提出了一种新的元启发式算法,名为 PhyloES,它利用基于进化策略的探索阶段与基于两种局部搜索算法的细化阶段相结合。广泛的计算实验表明,PhyloES 始终优于 FastME,尤其是在处理更大的数据集时,提供了具有更短树长的解决方案,但从拓扑角度来看也有很大的不同。
软件和数据可在 https://github.com/andygaspar/PHYLOES 上获得。