Division of Invertebrate Zoology, American Museum of Natural History, 200 Central Park West, New York, NY, 10024, USA.
Cladistics. 2024 Aug;40(4):430-437. doi: 10.1111/cla.12572. Epub 2024 Feb 28.
Phylogenetic graph search relies on a large number of highly parameterized search procedures (e.g. branch-swapping, perturbation, simulated annealing, genetic algorithms). These procedures vary in effectiveness across datasets and at different points in an analytical pipeline. Here, the multi-armed bandit problem is applied to phylogenetic graph search so that these procedures are used more effectively. Thompson sampling is applied to a collection of search and optimization "bandits" to favour productive search strategies over less successful ones. This adaptive random sampling strategy is shown to be more effective at producing heuristically optimal phylogenetic graphs, and more time efficient, than existing uniform-probability randomized search strategies. The strategy acts as a form of unsupervised machine learning that can be applied to a diversity of phylogenetic datasets without prior knowledge of their properties.
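The bandit idea described above can be illustrated with a minimal Beta-Bernoulli Thompson sampling sketch. This is not the paper's implementation. It assumes a binary reward (a search procedure either improves the current best graph or does not), and the strategy names, success probabilities, and the `thompson_search` and `make_toy_strategy` helpers are all hypothetical stand-ins for real search procedures.

```python
import random


def thompson_search(strategies, rounds, seed=0):
    """Pick among search strategies with Beta-Bernoulli Thompson sampling.

    `strategies` maps a name to a callable taking a random.Random and
    returning True when the strategy improved the current best result
    (an assumed, simplified reward signal).
    """
    rng = random.Random(seed)
    # Beta(1, 1) prior for every arm, i.e. uniform over success rates.
    wins = {name: 1 for name in strategies}
    losses = {name: 1 for name in strategies}
    pulls = {name: 0 for name in strategies}

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its posterior.
        draws = {
            name: rng.betavariate(wins[name], losses[name])
            for name in strategies
        }
        # Run the arm whose sampled rate is highest.
        choice = max(draws, key=draws.get)
        improved = strategies[choice](rng)
        pulls[choice] += 1
        # Update that arm's posterior with the observed outcome.
        if improved:
            wins[choice] += 1
        else:
            losses[choice] += 1
    return pulls, wins, losses


def make_toy_strategy(success_probability):
    """Hypothetical stand-in for a real search procedure."""
    return lambda rng: rng.random() < success_probability


# Toy arms with made-up success rates, for illustration only.
toy_strategies = {
    "branch_swap": make_toy_strategy(0.30),
    "perturbation": make_toy_strategy(0.10),
    "annealing": make_toy_strategy(0.05),
}

pulls, wins, losses = thompson_search(toy_strategies, rounds=2000)
print(pulls)
```

In this toy setting the sampler spends most of its rounds on the arm with the highest success rate while still occasionally trying the others, which is the contrast with a uniform-probability choice among procedures. A real search would need a reward tied to graph cost and possibly to run time, which this sketch does not model.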