Lozano Jenniffer Roa, DeGiorgio Michael, Assis Raquel, Adams Rich
bioRxiv. 2025 Jun 13:2025.06.12.659377. doi: 10.1101/2025.06.12.659377.
A central challenge in comparative biology is linking present-day trait variation across species with unobserved evolutionary processes that occurred in the past. In this endeavor, phylogenetic comparative methods are invaluable for fitting, comparing, and selecting evolutionary models of varying complexity and biological meaning. Evolutionary studies have traditionally relied on conventional statistical approaches to assess model fit and identify the model that best explains variation in a given trait. Here we explore an alternative strategy: applying supervised learning to predict evolutionary models via discriminant analysis. We formally introduce Evolutionary Discriminant Analysis (EvoDA) as an addition to the biologist's toolkit, offering a suite of new methods for studying trait evolution. We evaluate the performance of EvoDA alongside conventional model selection through a series of fungal phylogeny case studies, each targeting an increasingly challenging analytical task. These results showcase the strengths of EvoDA, with substantial improvements over conventional approaches when studying traits subject to measurement error, a condition that likely reflects empirical datasets. To complement our simulation-based benchmarking, we explore the application of EvoDA to a notoriously difficult task: predicting the mode and tempo of gene expression evolution. This empirical analysis suggests that stabilizing selection acts on a majority of genes, with bursts of expression evolution in a handful of genes related to stress, cellular transportation, and transcription regulation. Collectively, our findings illustrate the promise of EvoDA for predicting trait models across a range of evolutionary and experimental contexts, establishing a new methodological framework for the next era of comparative research.
To make sense of biodiversity, evolutionary studies have historically relied on conventional statistical procedures to evaluate competing hypotheses about the mode and tempo of trait evolution. Here, we introduce new supervised learning methods that substantially outperform traditional techniques for correctly assigning trait models across a range of evolutionary and experimental conditions. We find that these methods are highly robust to measurement noise expected from realistic trait data and offer new insights into a central question in comparative genomics: what are the evolutionary forces shaping variation in gene expression across species?
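The general idea of predicting an evolutionary model with discriminant analysis can be illustrated with a toy example. The sketch below is not the authors' implementation of EvoDA; it rests on several simplifying assumptions that do not come from the paper: a star phylogeny (independent lineages of equal length), only two candidate models (Brownian motion, "BM", and an Ornstein-Uhlenbeck process, "OU", whose optimum equals the root state), a single summary feature (the log sample variance of the tip values), and arbitrary parameter values. All function names and parameters are hypothetical. Traits are simulated with added measurement noise, a one-dimensional linear discriminant is trained on labeled simulations, and accuracy is checked on held-out simulations.

```python
import math
import random

MODELS = ("BM", "OU")  # hypothetical labels for the two candidate models

def simulate_tips(model, n_tips, rng, sigma2=1.0, alpha=2.0, height=1.0, noise_sd=0.1):
    """Simulate noisy tip values on a star phylogeny with root state 0 (assumption)."""
    if model == "BM":
        var = sigma2 * height
    elif model == "OU":
        # Variance of an OU process after time `height`, optimum at the root state.
        var = sigma2 / (2.0 * alpha) * (1.0 - math.exp(-2.0 * alpha * height))
    else:
        raise ValueError(f"unknown model: {model}")
    sd = math.sqrt(var)
    # The second Gaussian draw plays the role of measurement error.
    return [rng.gauss(0.0, sd) + rng.gauss(0.0, noise_sd) for _ in range(n_tips)]

def log_variance(tips):
    """Summary feature: log of the unbiased sample variance of the tip values."""
    mean = sum(tips) / len(tips)
    return math.log(sum((x - mean) ** 2 for x in tips) / (len(tips) - 1))

def fit_lda(features, labels):
    """Fit 1D linear discriminant analysis: class means, priors, pooled variance."""
    n = len(features)
    means, priors, pooled_ss = {}, {}, 0.0
    for k in sorted(set(labels)):
        xs = [x for x, y in zip(features, labels) if y == k]
        means[k] = sum(xs) / len(xs)
        priors[k] = len(xs) / n
        pooled_ss += sum((x - means[k]) ** 2 for x in xs)
    return means, priors, pooled_ss / (n - len(means))

def predict_lda(fit, x):
    """Return the class with the largest linear discriminant score."""
    means, priors, var = fit
    def score(k):
        return x * means[k] / var - means[k] ** 2 / (2.0 * var) + math.log(priors[k])
    return max(means, key=score)

def make_dataset(n_per_model, n_tips, rng):
    """Simulate labeled replicates and reduce each one to its summary feature."""
    feats, labs = [], []
    for model in MODELS:
        for _ in range(n_per_model):
            feats.append(log_variance(simulate_tips(model, n_tips, rng)))
            labs.append(model)
    return feats, labs

rng = random.Random(0)
train_x, train_y = make_dataset(500, 20, rng)
test_x, test_y = make_dataset(200, 20, rng)
fit = fit_lda(train_x, train_y)
accuracy = sum(predict_lda(fit, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out accuracy: {accuracy:.2f}")
```

A realistic analysis would simulate on the actual phylogeny, so that tip values are correlated through shared ancestry, and would use richer feature vectors and more candidate models. The toy version only shows the workflow: simulate under each model, train a classifier, then predict the model for new data.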