Phylogenetic analysis at deep timescales: unreliable gene trees, bypassed hidden support, and the coalescence/concatalescence conundrum.

Gatesy John, Springer Mark S

Department of Biology, University of California, Riverside, CA 92521, USA.

Mol Phylogenet Evol. 2014 Nov;80:231-66. doi: 10.1016/j.ympev.2014.08.013. Epub 2014 Aug 22.

Large datasets are required to solve difficult phylogenetic problems that are deep in the Tree of Life. Currently, two divergent systematic methods are commonly applied to such datasets: the traditional supermatrix approach (= concatenation) and "shortcut" coalescence (= coalescence methods wherein gene trees and the species tree are not co-estimated). When applied to ancient clades, these contrasting frameworks often produce congruent results, but in recent phylogenetic analyses of Placentalia (placental mammals), this is not the case. A recent series of papers has alternatively disputed and defended the utility of shortcut coalescence methods at deep phylogenetic scales. Here, we examine this exchange in the context of published phylogenomic data from Mammalia; in particular we explore two critical issues - the delimitation of data partitions ("genes") in coalescence analysis and hidden support that emerges with the combination of such partitions in phylogenetic studies. Hidden support - increased support for a clade in combined analysis of all data partitions relative to the support evident in separate analyses of the various data partitions, is a hallmark of the supermatrix approach and a primary rationale for concatenating all characters into a single matrix. In the most extreme cases of hidden support, relationships that are contradicted by all gene trees are supported when all of the genes are analyzed together. A valid fear is that shortcut coalescence methods might bypass or distort character support that is hidden in individual loci because small gene fragments are analyzed in isolation. Given the extensive systematic database for Mammalia, the assumptions and applicability of shortcut coalescence methods can be assessed with rigor to complement a small but growing body of simulation work that has directly compared these methods to concatenation. 
We document several remarkable cases of hidden support in both supermatrix and coalescence paradigms and argue that in most instances, the emergent support in the shortcut coalescence analyses is an artifact. By referencing rigorous molecular clock studies of Mammalia, we suggest that inaccurate gene trees that imply unrealistically deep coalescences debilitate shortcut coalescence analyses of the placental dataset. We document contradictory coalescence results for Placentalia, and outline a critical conundrum that challenges the general utility of shortcut coalescence methods at deep phylogenetic scales. In particular, the basic unit of analysis in coalescence analysis, the coalescence-gene, is expected to shrink in size as more taxa are analyzed, but as the amount of data for reconstruction of a gene tree ratchets downward, the number of nodes in the gene tree that need to be resolved ratchets upward. Some advocates of shortcut coalescence methods have attempted to address problems with inaccurate gene trees by concatenating multiple coalescence-genes to yield "gene trees" that better match the species tree. However, this hybrid concatenation/coalescence approach, "concatalescence," contradicts the most basic biological rationale for performing a coalescence analysis in the first place. We discuss this reality in the context of recent simulation work that also suggests inaccurate reconstruction of gene trees is more problematic for shortcut coalescence methods than deep coalescence of independently segregating loci is for concatenation methods.

Similar articles

Phylogenetic analysis at deep timescales: unreliable gene trees, bypassed hidden support, and the coalescence/concatalescence conundrum.

Mol Phylogenet Evol. 2014 Nov;80:231-66. doi: 10.1016/j.ympev.2014.08.013. Epub 2014 Aug 22.

The gene tree delusion.

Mol Phylogenet Evol. 2016 Jan;94(Pt A):1-33. doi: 10.1016/j.ympev.2015.07.018. Epub 2015 Jul 31.

Partitioned coalescence support reveals biases in species-tree methods and detects gene trees that determine phylogenomic conflicts.

Mol Phylogenet Evol. 2019 Oct;139:106539. doi: 10.1016/j.ympev.2019.106539. Epub 2019 Jun 18.

Resolution of a concatenation/coalescence kerfuffle: partitioned coalescence support and a robust family-level tree for Mammalia.

Cladistics. 2017 Jun;33(3):295-332. doi: 10.1111/cla.12170. Epub 2016 Aug 16.

Coalescence vs. concatenation: Sophisticated analyses vs. first principles applied to rooting the angiosperms.

Mol Phylogenet Evol. 2015 Oct;91:98-122. doi: 10.1016/j.ympev.2015.05.011. Epub 2015 May 19.

Applying species-tree analyses to deep phylogenetic histories: challenges and potential suggested from a survey of empirical phylogenetic studies.

Mol Phylogenet Evol. 2015 Feb;83:191-9. doi: 10.1016/j.ympev.2014.10.022. Epub 2014 Nov 4.

Complete generic-level phylogenetic analyses of palms (Arecaceae) with comparisons of supertree and supermatrix approaches.

Syst Biol. 2009 Apr;58(2):240-56. doi: 10.1093/sysbio/syp021. Epub 2009 May 30.

Delimiting Coalescence Genes (C-Genes) in Phylogenomic Data Sets.

Genes (Basel). 2018 Feb 26;9(3):123. doi: 10.3390/genes9030123.

Effectiveness of phylogenomic data and coalescent species-tree methods for resolving difficult nodes in the phylogeny of advanced snakes (Serpentes: Caenophidia).

Mol Phylogenet Evol. 2014 Dec;81:221-31. doi: 10.1016/j.ympev.2014.08.023. Epub 2014 Sep 3.

Estimating species phylogeny from gene-tree probabilities despite incomplete lineage sorting: an example from Melanoplus grasshoppers.

Syst Biol. 2007 Jun;56(3):400-11. doi: 10.1080/10635150701405560.

Cited by

Leveraging Weighted Quartet Distributions for Enhanced Species Tree Inference from Genome-Wide Data.

Genome Biol Evol. 2025 Sep 2;17(9). doi: 10.1093/gbe/evaf159.

Concatenation fails to describe the anomalous radiation of giant cockroaches (Blattodea: Blaberidae) despite moderate to low discordance.

BMC Ecol Evol. 2025 Jul 21;25(1):72. doi: 10.1186/s12862-025-02409-4.

Resolving phylogenetic conflicts in Pandanales: the dual roles of gene flow and whole-genome duplication.

Front Plant Sci. 2025 Feb 24;16:1511582. doi: 10.3389/fpls.2025.1511582. eCollection 2025.

MEGA12: Molecular Evolutionary Genetic Analysis Version 12 for Adaptive and Green Computing.

Mol Biol Evol. 2024 Dec 6;41(12). doi: 10.1093/molbev/msae263.

The Meaning and Measure of Concordance Factors in Phylogenomics.

Mol Biol Evol. 2024 Nov 1;41(11). doi: 10.1093/molbev/msae214.

Phylogenomic Discordance is Driven by Wide-Spread Introgression and Incomplete Lineage Sorting During Rapid Species Diversification Within Rattlesnakes (Viperidae: Crotalus and Sistrurus).

Syst Biol. 2024 Oct 25;73(4):722-741. doi: 10.1093/sysbio/syae018.

Phylogenomics of Neogastropoda: The Backbone Hidden in the Bush.

Syst Biol. 2024 Sep 5;73(3):521-531. doi: 10.1093/sysbio/syae010.

MAST: Phylogenetic Inference with Mixtures Across Sites and Trees.

Syst Biol. 2024 Jul 27;73(2):375-391. doi: 10.1093/sysbio/syae008.

Dynamic evolution of size and colour in the highly specialized ant-eating spiders.

Proc Biol Sci. 2023 Aug 9;290(2004):20230797. doi: 10.1098/rspb.2023.0797.

Confusion will be my epitaph: genome-scale discordance stifles phylogenetic resolution of Holothuroidea.

Proc Biol Sci. 2023 Jul 12;290(2002):20230988. doi: 10.1098/rspb.2023.0988.
