Division of Information System Design, Tokyo Denki University, Hatoyama, Saitama 350-0394, Japan.
Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR.
J Comput Biol. 2020 Sep;27(9):1422-1432. doi: 10.1089/cmb.2019.0432. Epub 2020 Feb 12.
The problem of computing the rooted subtree prune and regraft (rSPR) distance of two phylogenetic trees is computationally hard and so is the problem of computing the hybridization number of two phylogenetic trees (denoted by Hybridization Number Computation [HNC]). Since they are important problems in phylogenetics, they have been studied extensively in the literature. Indeed, quite a number of exact or approximation algorithms have been designed and implemented for them. In this article, we design and implement several approximation algorithms for them and one exact algorithm for HNC. Our experimental results show that the resulting exact program is much faster (namely, more than 80 times faster for the easiest dataset used in the experiments) than the previous best and its superiority in speed becomes even more significant for more difficult instances. Moreover, the resulting approximation program's output has much better results than the previous bests; indeed, the outputs are always nearly optimal and often optimal. Of particular interest is the usage of the Monte Carlo tree search (MCTS) method in the design of our approximation algorithms. Our experimental results show that with MCTS, we can often solve HNC exactly within short time.
计算两棵系统发生树的有根子树修剪和重新嫁接(rSPR)距离的问题在计算上是困难的,计算两棵系统发生树的杂交数(表示为杂交数计算[HNC])的问题也是如此。由于它们是系统发生学中的重要问题,因此在文献中已经进行了广泛的研究。事实上,已经为它们设计和实现了相当多的精确或近似算法。在本文中,我们为它们设计和实现了几个近似算法,以及一个用于 HNC 的精确算法。我们的实验结果表明,所得到的精确程序比以前的最佳程序快得多(即在实验中使用的最简单数据集上快 80 多倍),并且对于更困难的实例,其速度优势变得更加显著。此外,所得到的近似程序的输出比以前的最佳程序要好得多;实际上,输出总是接近最优的,而且经常是最优的。特别有趣的是在我们的近似算法设计中使用了蒙特卡罗树搜索(MCTS)方法。我们的实验结果表明,使用 MCTS,我们通常可以在短时间内精确地解决 HNC 问题。