Inference of Adaptive Shifts for Multivariate Correlated Traits.

Affiliations

Unité Mixte de Recherche Mathématiques et Informatique Appliquées (MIA - Paris), AgroParisTech, Institut National de la Recherche Agronomique (INRA), Université Paris-Saclay, 16 rue Claude Bernard, 75005 Paris, France.

Unité de Recherche Mathématiques et Informatique Appliquées du Génome à l'Environnement (MaIAGE), Institut National de la Recherche Agronomique (INRA), Université Paris-Saclay, Domaine de Vilvert, 78352 Jouy-en-Josas, France.

Publication information

Syst Biol. 2018 Jul 1;67(4):662-680. doi: 10.1093/sysbio/syy005.

DOI:10.1093/sysbio/syy005

PMID:29385556

Abstract

To study the evolution of several quantitative traits, the classical phylogenetic comparative framework consists of a multivariate random process running along the branches of a phylogenetic tree. The Ornstein-Uhlenbeck (OU) process is sometimes preferred to the simple Brownian motion (BM) as it models stabilizing selection toward an optimum. The optimum for each trait is likely to be changing over the long periods of time spanned by large modern phylogenies. Our goal is to automatically detect the position of these shifts on a phylogenetic tree, while accounting for correlations between traits, which might exist because of structural or evolutionary constraints. We show that, in the presence of shifts, phylogenetic Principal Component Analysis fails to decorrelate traits efficiently, so that any method aiming at finding shifts needs to deal with correlation simultaneously. We introduce here a simplification of the full multivariate OU model, named scalar OU, which allows for noncausal correlations and is still computationally tractable. We extend the equivalence between the OU and a BM on a rescaled tree to our multivariate framework. We describe an Expectation-Maximization (EM) algorithm that allows for a maximum likelihood estimation of the shift positions, associated with a new model selection criterion, accounting for the identifiability issues for the shift localization on the tree. The method, freely available as an R package (PhylogeneticEM), is fast and can deal with missing values. We demonstrate its efficiency and accuracy compared to another state-of-the-art method ($\ell$1ou) on a wide range of simulated scenarios and use this new framework to reanalyze recently gathered data sets on New World monkeys and Anolis lizards.
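To make the scalar OU idea concrete, here is a minimal Python sketch of a multivariate OU process along a single lineage whose optimum shifts partway through. It is an illustration only, not the PhylogeneticEM implementation: it assumes the scalar OU model takes the selection strength to be a scalar `alpha` times the identity, with trait correlations carried entirely by a rate matrix `R`, and it does not model a tree or the EM inference. The function name `simulate_scalar_ou` and all parameter values are hypothetical.

```python
import numpy as np


def simulate_scalar_ou(x0, alpha, R, optima, times, rng):
    """Simulate dX = -alpha * (X - beta(t)) dt + dB along one lineage.

    B is a Brownian motion with rate matrix R (assumed positive definite).
    optima[k] is the optimum beta in force on the interval times[k]..times[k+1].
    Uses the exact Gaussian transition of the OU process, not an Euler scheme.
    """
    x = np.asarray(x0, dtype=float)
    chol = np.linalg.cholesky(np.asarray(R, dtype=float))
    path = [x.copy()]
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        decay = np.exp(-alpha * dt)
        # Conditional mean: pulled from the current value toward the optimum.
        mean = optima[k] + decay * (x - optima[k])
        # Conditional covariance: (1 - exp(-2 alpha dt)) / (2 alpha) * R.
        scale = np.sqrt((1.0 - decay**2) / (2.0 * alpha))
        x = mean + scale * (chol @ rng.standard_normal(x.size))
        path.append(x.copy())
    return np.array(path)


# Hypothetical setup: two correlated traits, optimum shifts at time 5.
rng = np.random.default_rng(0)
times = np.linspace(0.0, 10.0, 101)
old_optimum = np.array([0.0, 0.0])
new_optimum = np.array([2.0, -1.0])
optima = np.vstack([np.tile(old_optimum, (50, 1)), np.tile(new_optimum, (50, 1))])
R = np.array([[0.010, 0.005], [0.005, 0.010]])
path = simulate_scalar_ou([0.0, 0.0], alpha=3.0, R=R, optima=optima, times=times, rng=rng)
print(path[-1])  # final trait values, expected to sit close to new_optimum
```

Because the pull toward the optimum is the same scalar for every trait in this sketch, the traits relax at a common rate while remaining correlated through `R`, which is the kind of simplification the abstract describes as keeping the model tractable.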
