Relative Efficiencies of Simple and Complex Substitution Models in Estimating Divergence Times in Phylogenomics.

Affiliations

Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA.

Department of Biology, Temple University, Philadelphia, PA.

Publication information

Mol Biol Evol. 2020 Jun 1;37(6):1819-1831. doi: 10.1093/molbev/msaa049.

DOI:10.1093/molbev/msaa049

PMID:32119075

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7253201/

Abstract

The conventional wisdom in molecular evolution is to apply parameter-rich models of nucleotide and amino acid substitutions for estimating divergence times. However, the actual extent of the difference between time estimates produced by highly complex models compared with those from simple models is yet to be quantified for contemporary data sets that frequently contain sequences from many species and genes. In a reanalysis of many large multispecies alignments from diverse groups of taxa, we found that the use of the simplest models can produce divergence time estimates and credibility intervals similar to those obtained from the complex models applied in the original studies. This result is surprising because the use of simple models underestimates sequence divergence for all the data sets analyzed. We found three fundamental reasons for the observed robustness of time estimates to model complexity in many practical data sets. First, the estimates of branch lengths and node-to-tip distances under the simplest model show an approximately linear relationship with those produced by using the most complex models applied on data sets with many sequences. Second, relaxed clock methods automatically adjust rates on branches that experience considerable underestimation of sequence divergences, resulting in time estimates that are similar to those from complex models. And, third, the inclusion of even a few good calibrations in an analysis can reduce the difference in time estimates from simple and complex models. The robustness of time estimates to model complexity in these empirical data analyses is encouraging, because all phylogenomics studies use statistical models that are oversimplified descriptions of actual evolutionary substitution processes.
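
The first reason given in the abstract (simple-model distances are underestimates, yet scale almost linearly with complex-model distances) can be illustrated with a small sketch. This is not the authors' analysis, which used branch lengths estimated from large alignments. It is a toy comparison of two textbook pairwise distance formulas, Jukes-Cantor and Jukes-Cantor with gamma-distributed rates among sites. The proportions in `p_values` and the shape parameter `alpha` are hypothetical values chosen only for illustration.

```python
import math

def jc69_distance(p):
    """Jukes-Cantor (1969) distance for a proportion p of differing sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def jc69_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates among sites (shape alpha)."""
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical observed proportions of differing sites, shallow to deep.
p_values = [0.02, 0.05, 0.10, 0.15, 0.20, 0.25]
alpha = 1.0  # hypothetical gamma shape (strong rate variation among sites)

simple = [jc69_distance(p) for p in p_values]
complex_ = [jc69_gamma_distance(p, alpha) for p in p_values]

# Depth of each comparison relative to the deepest one. Under a strict
# clock this ratio plays the role of a relative divergence time.
rel_simple = [d / simple[-1] for d in simple]
rel_complex = [d / complex_[-1] for d in complex_]

for p, s, c, rs, rc in zip(p_values, simple, complex_, rel_simple, rel_complex):
    print(f"p={p:.2f}  simple={s:.4f}  complex={c:.4f}  "
          f"rel_simple={rs:.3f}  rel_complex={rc:.3f}")
print(f"Pearson r = {pearson_r(simple, complex_):.4f}")
```

With these assumed inputs, every simple-model distance is smaller than its gamma-corrected counterpart, but the two sets are strongly correlated and the relative depths differ only modestly. The sketch ignores rate variation among lineages and calibrations, which the abstract names as the second and third reasons.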

Similar articles

Performance of relaxed-clock methods in estimating evolutionary divergence times and their credibility intervals.

Mol Biol Evol. 2010 Jun;27(6):1289-300. doi: 10.1093/molbev/msq014. Epub 2010 Jan 21.

Assessing Rapid Relaxed-Clock Methods for Phylogenomic Dating.

Genome Biol Evol. 2021 Nov 5;13(11). doi: 10.1093/gbe/evab251.

Characterization of the uncertainty of divergence time estimation under relaxed molecular clock models using multiple loci.

Syst Biol. 2015 Mar;64(2):267-80. doi: 10.1093/sysbio/syu109. Epub 2014 Dec 11.

Using a GTR+Γ substitution model for dating sequence divergence when stationarity and time-reversibility assumptions are violated.

Bioinformatics. 2020 Dec 30;36(Suppl_2):i884-i894. doi: 10.1093/bioinformatics/btaa820.

Branch length estimation and divergence dating: estimates of error in Bayesian and maximum likelihood frameworks.

BMC Evol Biol. 2010 Jan 11;10:5. doi: 10.1186/1471-2148-10-5.

Molecular dating for phylogenies containing a mix of populations and species by using Bayesian and RelTime approaches.

Mol Ecol Resour. 2021 Jan;21(1):122-136. doi: 10.1111/1755-0998.13249. Epub 2020 Sep 16.

Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds.

Mol Biol Evol. 2006 Jan;23(1):212-26. doi: 10.1093/molbev/msj024. Epub 2005 Sep 21.

RelTime Relaxes the Strict Molecular Clock throughout the Phylogeny.

Genome Biol Evol. 2018 Jun 1;10(6):1631-1636. doi: 10.1093/gbe/evy118.

Estimating divergence times in large molecular phylogenies.

Proc Natl Acad Sci U S A. 2012 Nov 20;109(47):19333-8. doi: 10.1073/pnas.1213199109. Epub 2012 Nov 5.

Cited by

The impact of software and criteria on the selection of best-fit nucleotide substitution models for molecular evolutionary genetic analysis.

PLoS One. 2025 Mar 26;20(3):e0319774. doi: 10.1371/journal.pone.0319774. eCollection 2025.

Challenges in Assembling the Dated Tree of Life.

Genome Biol Evol. 2024 Oct 9;16(10). doi: 10.1093/gbe/evae229.

Modeling Substitution Rate Evolution across Lineages and Relaxing the Molecular Clock.

Genome Biol Evol. 2024 Sep 3;16(9). doi: 10.1093/gbe/evae199.

Extant Sequence Reconstruction: The Accuracy of Ancestral Sequence Reconstructions Evaluated by Extant Sequence Cross-Validation.

J Mol Evol. 2024 Apr;92(2):181-206. doi: 10.1007/s00239-024-10162-3. Epub 2024 Mar 19.

Ancestral sequence reconstruction as a tool to study the evolution of wood decaying fungi.

Front Fungal Biol. 2022 Oct 14;3:1003489. doi: 10.3389/ffunb.2022.1003489. eCollection 2022.

Assessing the relative performance of fast molecular dating methods for phylogenomic data.

BMC Genomics. 2022 Dec 3;23(1):798. doi: 10.1186/s12864-022-09030-5.

Methodologies for Microbial Ancestral Sequence Reconstruction.

Methods Mol Biol. 2022;2569:283-303. doi: 10.1007/978-1-0716-2691-7_14.

Consequences of Substitution Model Selection on Protein Ancestral Sequence Reconstruction.

Mol Biol Evol. 2022 Jul 2;39(7). doi: 10.1093/molbev/msac144.

Phylogenomic analyses of echinoid diversification prompt a re-evaluation of their fossil record.

Elife. 2022 Mar 22;11:e72460. doi: 10.7554/eLife.72460.

Data-driven speciation tree prior for better species divergence times in calibration-poor molecular phylogenies.

Bioinformatics. 2021 Jul 12;37(Suppl_1):i102-i110. doi: 10.1093/bioinformatics/btab307.

