Landis Michael J, Thompson Ammon
Department of Biology, Washington University, St. Louis, MO, 63110, USA.
Participant in an education program sponsored by U.S. Department of Defense (DOD).
Syst Biol. 2025 May 14. doi: 10.1093/sysbio/syaf036.
Phylogenies contain a wealth of information about the evolutionary history and process that gave rise to the diversity of life. This information can be extracted by fitting phylogenetic models to trees. However, many realistic phylogenetic models lack tractable likelihood functions, prohibiting their use with standard inference methods. We present phyddle, pipeline-based software for performing phylogenetic modeling tasks on trees using likelihood-free deep learning approaches. phyddle has a flexible command-line interface, making it easy to integrate deep learning approaches for phylogenetics into research workflows. phyddle coordinates modeling tasks through five pipeline analysis steps (Simulate, Format, Train, Estimate, and Plot) that transform raw phylogenetic datasets as input into numerical and visual model-based output. We conduct three experiments to compare the accuracy of likelihood-based inferences against deep learning-based inferences obtained through phyddle. Benchmarks show that phyddle accurately performs the inference tasks for which it was designed, such as estimating macroevolutionary parameters, selecting among continuous trait evolution models, and passing coverage tests for epidemiological models, even for models that lack tractable likelihoods. Learn more about phyddle at https://phyddle.org.