The Past Sure is Tense: On Interpreting Phylogenetic Divergence Time Estimates.

Affiliation

Department of Ecology & Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, MI 48109, USA.

Publication information

Syst Biol. 2018 Mar 1;67(2):340-353. doi: 10.1093/sysbio/syx074.

Abstract

Divergence time estimation, the calibration of a phylogeny to geological time, is an integral first step in modeling the tempo of biological evolution (traits and lineages). However, despite increasingly sophisticated methods to infer divergence times from molecular genetic sequences, the estimated ages of many nodes across the tree of life contrast significantly and consistently with timeframes conveyed by the fossil record. This is perhaps best exemplified by crown angiosperms, where molecular clock (Triassic) estimates predate the oldest (Early Cretaceous) undisputed angiosperm fossils by tens of millions of years or more. While the incompleteness of the fossil record is a common concern, issues of data limitation and model inadequacy are viable (if underexplored) alternative explanations. In this vein, Beaulieu et al. (2015) convincingly demonstrated how methods of divergence time inference can be misled by both (i) extreme state-dependent molecular substitution rate heterogeneity and (ii) biased sampling of representative major lineages. These results demonstrate the impact of (potentially common) model violations. Here, we suggest another potential challenge: that the configuration of the statistical inference problem (i.e., the parameters, their relationships, and associated priors) alone may preclude the reconstruction of the paleontological timeframe for the crown age of angiosperms. We demonstrate, through sampling from the joint prior (formed by combining the tree (diversification) prior with the calibration densities specified for fossil-calibrated nodes), that with no data present at all, an Early Cretaceous age for crown angiosperms is rejected (i.e., has essentially zero probability). More worrisome, however, is that for the 24 nodes calibrated by fossils, almost all have indistinguishable marginal prior and posterior age distributions when employing routine lognormal fossil calibration priors. These results indicate that there is inadequate information in the data to overrule the joint prior. Given that these calibrated nodes are strategically placed in disparate regions of the tree, they act to anchor the tree scaffold, and so the posterior inference for the tree as a whole is largely determined by the pseudodata present in the (often arbitrary) calibration densities. We recommend, as for any Bayesian analysis, that marginal prior and posterior distributions be carefully compared to determine whether signal is coming from the data or prior belief, especially for parameters of direct interest. This recommendation is not novel. However, given how rarely such checks are carried out in evolutionary biology, it bears repeating. Our results demonstrate the fundamental importance of prior/posterior comparisons in any Bayesian analysis, and we hope that they further encourage both researchers and journals to consistently adopt this crucial step as standard practice. Finally, we note that the results presented here do not refute the biological modeling concerns identified by Beaulieu et al. (2015). Both sets of issues remain apposite to the goals of accurate divergence time estimation, and only by considering them in tandem can we move forward more confidently.
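The prior/posterior comparison recommended above can be illustrated with a minimal sketch. It is not the authors' procedure: the overlap statistic, the function names, and every numeric value (offset, lognormal parameters, sample sizes) are assumptions chosen for illustration. In a real analysis the two sets of node ages would come from MCMC runs with and without the sequence data; here they are simulated. An overlap near 1 suggests the posterior age is essentially the joint prior, while an overlap near 0 suggests the data moved the estimate.

```python
import random


def overlap_coefficient(prior, posterior, bins=50):
    """Histogram-based overlap of two samples (0 = disjoint, 1 = identical)."""
    lo = min(min(prior), min(posterior))
    hi = max(max(prior), max(posterior))
    if hi == lo:
        return 1.0
    width = (hi - lo) / bins

    def density(samples):
        # Fraction of samples falling in each of the shared bins.
        counts = [0] * bins
        for x in samples:
            idx = min(int((x - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(samples) for c in counts]

    p, q = density(prior), density(posterior)
    return sum(min(a, b) for a, b in zip(p, q))


def offset_lognormal(rng, offset, mu, sigma, n):
    """Draw n node ages (Ma) from a lognormal shifted by a minimum fossil age."""
    return [offset + rng.lognormvariate(mu, sigma) for _ in range(n)]


rng = random.Random(42)

# Hypothetical calibrated node: minimum age 125 Ma, lognormal(mu=1.5, sigma=0.5).
prior_ages = offset_lognormal(rng, 125.0, 1.5, 0.5, 5000)

# Case 1 (simulated): the posterior is just another draw from the prior,
# i.e. the data added no information about this node's age.
posterior_uninformed = offset_lognormal(rng, 125.0, 1.5, 0.5, 5000)

# Case 2 (simulated): the data pulled the age well away from the prior.
posterior_informed = [rng.gauss(160.0, 3.0) for _ in range(5000)]

overlap_uninformed = overlap_coefficient(prior_ages, posterior_uninformed)
overlap_informed = overlap_coefficient(prior_ages, posterior_informed)

print(f"overlap, data uninformative: {overlap_uninformed:.2f}")
print(f"overlap, data informative:   {overlap_informed:.2f}")
```

A histogram overlap is only one of several reasonable summaries; comparing quantiles or plotting the two densities side by side would serve the same purpose. The key design point is that both samples are binned on a shared grid so the comparison is like for like.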
