Computational Evolution Group, Department of Biosystems Science and Engineering, ETH Zürich, 4058 Basel, Switzerland;
Computational Evolution Group, Swiss Institute of Bioinformatics (SIB), 4058 Basel, Switzerland.
Proc Natl Acad Sci U S A. 2019 Aug 20;116(34):16921-16926. doi: 10.1073/pnas.1813823116. Epub 2019 Aug 2.
Phylogenetic comparative methods are widely used to understand and quantify the evolution of phenotypic traits, based on phylogenetic trees and trait measurements of extant species. Such analyses depend crucially on the underlying model. Gaussian phylogenetic models like Brownian motion and Ornstein-Uhlenbeck processes are the workhorses of modeling continuous-trait evolution. However, these models fit large trees poorly, because they neglect the heterogeneity of the evolutionary process in different lineages of the tree. Previous work has addressed this issue by introducing shifts in the evolutionary model occurring at inferred points in the tree. However, for computational reasons, in all current implementations, these shifts are "intramodel," meaning that they allow jumps in 1 or 2 model parameters, keeping all other parameters "global" for the entire tree. There is no biological reason to restrict a shift to a single model parameter or, even, to a single type of model. Mixed Gaussian phylogenetic models (MGPMs) incorporate the idea of jointly inferring different types of Gaussian models associated with different parts of the tree. Here, we propose an approximate maximum-likelihood method for fitting MGPMs to comparative data comprising possibly incomplete measurements for several traits from extant and extinct phylogenetically linked species. We applied the method to the largest published tree of mammal species with body- and brain-mass measurements, showing strong statistical support for an MGPM with 12 distinct evolutionary regimes. Based on this result, we state a hypothesis for the evolution of the brain-body-mass allometry over the past 160 million years.
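As a rough illustration of what "different types of Gaussian models on different parts of the tree" means, the sketch below propagates the mean and variance of a single trait along one root-to-tip path whose branches belong to different regimes. It is a univariate toy under stated assumptions, not the fitting method proposed in the paper: the names `Regime`, `branch_moments`, and `path_moments` are hypothetical, and the formulas are the standard conditional moments of Brownian motion and of the Ornstein-Uhlenbeck process.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Regime:
    """Hypothetical container for one univariate Gaussian regime."""
    sigma2: float        # diffusion rate (variance added per unit time under BM)
    alpha: float = 0.0   # OU selection strength; alpha == 0 means Brownian motion
    theta: float = 0.0   # OU optimum; has no effect when alpha == 0


def branch_moments(x0, t, regime):
    """Mean and variance of the trait after a branch of length t, given start value x0."""
    if regime.alpha == 0.0:
        # Brownian motion: no drift, variance grows linearly with time.
        return x0, regime.sigma2 * t
    decay = math.exp(-regime.alpha * t)
    # OU: the mean relaxes toward theta, the variance saturates at sigma2 / (2 * alpha).
    mean = regime.theta + (x0 - regime.theta) * decay
    var = regime.sigma2 / (2.0 * regime.alpha) * (1.0 - decay ** 2)
    return mean, var


def path_moments(x0, path):
    """Propagate mean and variance along a path given as (branch_length, Regime) pairs.

    Each branch is a linear-Gaussian map of its starting value, so the variance
    inherited from earlier branches is scaled by exp(-2 * alpha * t) and the
    variance generated on the branch itself is added.
    """
    mean, var = x0, 0.0
    for t, regime in path:
        decay = math.exp(-regime.alpha * t)
        mean, branch_var = branch_moments(mean, t, regime)
        var = decay ** 2 * var + branch_var
    return mean, var


if __name__ == "__main__":
    bm = Regime(sigma2=2.0)                        # assumed illustrative values
    ou = Regime(sigma2=2.0, alpha=1.0, theta=5.0)  # assumed illustrative values
    # A lineage that evolves under BM for 3 time units, then shifts to OU for 1 unit.
    print(path_moments(0.0, [(3.0, bm), (1.0, ou)]))
```

An intramodel shift in the sense of the abstract would change only one field of `Regime` (for example `theta`) between branches, whereas the mixed setting also lets the model type itself change, here from BM to OU.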