Belozersky Institute, Lomonosov Moscow State University, Moscow, Russia.
Higher School of Economics, Moscow, Russia.
Mol Biol Evol. 2024 Jun 1;41(6). doi: 10.1093/molbev/msae084.
Phylogenetic inference based on protein sequence alignment is a widely used procedure. Numerous phylogenetic algorithms have been developed, most of which have many parameters and options. Choosing a program, options, and parameters can be a nontrivial task. No benchmark for comparison of phylogenetic programs on real protein sequences was publicly available. We have developed PhyloBench, a benchmark for evaluating the quality of phylogenetic inference, and used it to test a number of popular phylogenetic programs. PhyloBench is based on natural, not simulated, protein sequences of orthologous evolutionary domains. The measure of accuracy of an inferred tree is its distance to the corresponding species tree. A number of tree-to-tree distance measures were tested. The most reliable results were obtained using the Robinson-Foulds distance. Our results confirmed recent findings that distance methods are more accurate than maximum likelihood (ML) and maximum parsimony. We tested the Bayesian program MrBayes on natural protein sequences and found that, on our datasets, it performs better than ML but worse than distance methods. Of the methods we tested, the Balanced Minimum Evolution method implemented in FastME yielded the best results on our material. Alignments and reference species trees are available at https://mouse.belozersky.msu.ru/tools/phylobench/ together with a web interface that allows for a semi-automatic comparison of a user's method with a number of popular programs.
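To illustrate the accuracy measure named above, the following is a minimal sketch of the Robinson-Foulds distance between an inferred tree and a species tree on the same set of leaves. It is not the PhyloBench implementation. The nested-tuple tree encoding, the function names, and the example trees are assumptions made for illustration. Trees are treated as unrooted, and the distance is the number of nontrivial bipartitions (splits) present in exactly one of the two trees.

```python
def bipartitions(tree):
    """Return the nontrivial splits of a tree given as nested tuples of leaf names.

    Hypothetical encoding: a leaf is a string, an internal node is a tuple
    of children. The tree is treated as unrooted.
    """
    def leaf_set(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(leaf_set(child) for child in node))

    all_leaves = leaf_set(tree)
    reference = min(all_leaves)  # fixed leaf used to pick a canonical side
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - leaves
        # Keep only splits with at least two leaves on each side.
        if len(leaves) > 1 and len(other) > 1:
            # Store the side that does not contain the reference leaf,
            # so the same split is always represented the same way.
            splits.add(other if reference in leaves else leaves)
        return leaves

    visit(tree)
    return all_leaves, splits


def robinson_foulds(tree_a, tree_b):
    """Count the splits found in exactly one of the two trees."""
    leaves_a, splits_a = bipartitions(tree_a)
    leaves_b, splits_b = bipartitions(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must have identical leaf sets")
    return len(splits_a ^ splits_b)


# Illustrative trees only; the leaf names are placeholders, not benchmark data.
species_tree = (("A", "B"), "C", ("D", "E"))
inferred_tree = (("A", "C"), "B", ("D", "E"))

print(robinson_foulds(species_tree, species_tree))   # identical trees
print(robinson_foulds(species_tree, inferred_tree))  # one misplaced pair
```

A lower value means the inferred tree agrees with the species tree on more splits. In this sketch the distance is left unnormalized; dividing by the total number of splits in both trees would give a value between 0 and 1.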