Department of Life and Environmental Sciences, University of California, Merced, CA 95343, USA.
Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, 446 Hesler Biology Building, Knoxville, TN 37996, USA.
Syst Biol. 2024 Jul 27;73(2):470-485. doi: 10.1093/sysbio/syae015.
Chronograms (phylogenies with branch lengths proportional to time) represent key data on the timing of evolutionary events, allowing us to study natural processes in many areas of biological research. Chronograms also provide valuable information that can be used for education, science communication, and conservation policy decisions. Yet, achieving a high-quality reconstruction of a chronogram is a difficult and resource-consuming task. Here we present DateLife, phylogenetic software implemented as an R package and an R Shiny web application available at www.datelife.org, that provides services for efficient and easy discovery, summary, reuse, and reanalysis of node age data mined from a curated database of expert, peer-reviewed, and openly available chronograms. The main DateLife workflow starts with one or more scientific taxon names provided by a user. Names are processed and standardized to a unified taxonomy, allowing DateLife to run a name match across its local chronogram database, which is curated from Open Tree of Life's phylogenetic repository, and to extract all chronograms that contain at least two queried taxon names, along with their metadata. Finally, node ages from matching chronograms are mapped using the congruification algorithm to corresponding nodes on a tree topology, either extracted from Open Tree of Life's synthetic phylogeny or provided by the user. Congruified node ages are used as secondary calibrations to date the chosen topology, with or without initial branch lengths, using different phylogenetic dating methods such as BLADJ, treePL, PATHd8, and MrBayes.
We performed a cross-validation test to compare node ages resulting from a DateLife analysis (i.e., phylogenetic dating using secondary calibrations) to those from the original chronograms (i.e., obtained with primary calibrations), and found that DateLife's node age estimates are consistent with the age estimates from the original chronograms, with the largest variation in ages occurring around topologically deeper nodes. Because the results from any software for scientific analysis can only be as good as the data used as input, we highlight the importance of considering the results of a DateLife analysis in the context of the input chronograms. DateLife can help to increase awareness of the existing disparities among alternative hypotheses of dates for the same diversification events, and to support exploration of the effect of alternative chronogram hypotheses on downstream analyses, providing a framework for a more informed interpretation of evolutionary results.
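The name-matching and congruification steps described in the workflow can be illustrated with a small sketch. This is not DateLife's implementation (DateLife is an R package) and it does not use any DateLife API. It is a simplified Python illustration under stated assumptions: each tree is reduced to a set of clades, each clade is a set of tip names, and a source chronogram is a mapping from clade to node age. The function names `matching_chronograms` and `congruify`, the toy database, and all ages are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable

Clade = FrozenSet[str]          # a node, identified by the tip names it subtends
NodeAges = Dict[Clade, float]   # a toy chronogram: clade -> node age (e.g. in Myr)


def matching_chronograms(query: Iterable[str],
                         database: Dict[str, NodeAges]) -> Dict[str, NodeAges]:
    """Keep chronograms that contain at least two of the queried taxon names."""
    wanted = set(query)
    hits = {}
    for study, node_ages in database.items():
        tips = frozenset().union(*node_ages)   # all tip names in this chronogram
        if len(tips & wanted) >= 2:
            hits[study] = node_ages
    return hits


def congruify(source_ages: NodeAges,
              target_clades: Iterable[Clade]) -> Dict[Clade, float]:
    """Map source node ages onto target nodes that subtend the same shared tips.

    Simplified rule (an assumption of this sketch): two nodes correspond when,
    after dropping tips not present in both trees, they subtend the same tip
    set. When several nested nodes reduce to the same set, the smallest clade
    (the MRCA of the shared tips) is used on each side.
    """
    target_clades = list(target_clades)
    shared = frozenset().union(*source_ages) & frozenset().union(*target_clades)

    # Visit larger clades first so that smaller clades overwrite them.
    source_by_reduced: Dict[Clade, float] = {}
    for clade in sorted(source_ages, key=len, reverse=True):
        reduced = clade & shared
        if len(reduced) >= 2:
            source_by_reduced[reduced] = source_ages[clade]

    target_by_reduced: Dict[Clade, Clade] = {}
    for clade in sorted(target_clades, key=len, reverse=True):
        reduced = clade & shared
        if len(reduced) >= 2:
            target_by_reduced[reduced] = clade

    # A calibration exists wherever both trees have a node for the reduced set.
    return {target_by_reduced[reduced]: age
            for reduced, age in source_by_reduced.items()
            if reduced in target_by_reduced}


# Hypothetical toy data: tip names, study names, and ages are invented.
database = {
    # topology (((A,B),C),E)
    "study1": {frozenset("AB"): 5.0, frozenset("ABC"): 12.0, frozenset("ABCE"): 30.0},
    # topology (A,(X,Y)); contains only one queried name, so it is filtered out
    "study2": {frozenset("XY"): 3.0, frozenset("AXY"): 20.0},
}
# Target topology ((A,B),(C,D)), e.g. one supplied by the user.
target = [frozenset("AB"), frozenset("CD"), frozenset("ABCD")]

hits = matching_chronograms(["A", "B", "C", "D"], database)
calibrations = {study: congruify(ages, target) for study, ages in hits.items()}
```

In this toy case the source root (age 30.0) is not transferred, because tip E is absent from the target; the target root instead receives the age of the source MRCA of A, B, and C. The resulting clade-to-age mappings play the role of the secondary calibrations that a dating method would then use to date the chosen topology.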