Synthetic and Systems Biology Unit, Institute of Biochemistry, BRC-HAS, Szeged 6726, Hungary.
Institute of Archaeology, Research Centre for the Humanities, Hungarian Academy of Sciences, Budapest 1097, Hungary.
Syst Biol. 2020 Jan 1;69(1):17-37. doi: 10.1093/sysbio/syz029.
Resolving deep divergences in the tree of life is challenging even for analyses of genome-scale phylogenetic data sets. Relationships among the Basidiomycota subphyla, the rusts and allies (Pucciniomycotina), smuts and allies (Ustilaginomycotina), and mushroom-forming fungi and allies (Agaricomycotina), have proved particularly recalcitrant to both traditional multigene and genome-scale phylogenetics. Here, we address basal Basidiomycota relationships using concatenated and gene tree-based analyses of various phylogenomic data sets to examine the contribution of several potential sources of bias. We evaluate the contribution of biological causes (hard polytomy, incomplete lineage sorting) versus unmodeled evolutionary processes and factors that exacerbate their effects (e.g., fast-evolving sites and long-branch taxa) to inferences of basal Basidiomycota relationships. Bayesian Markov Chain Monte Carlo and likelihood mapping analyses reject the hard polytomy with confidence. In concatenated analyses, fast-evolving sites and oversimplified models of amino acid substitution favored the grouping of smuts with mushroom-forming fungi, often leading to maximal bootstrap support in both concatenation and coalescent analyses. In contrast, the most conserved data subsets grouped rusts and allies with mushroom-forming fungi, although this relationship proved labile and sensitive to model choice, to different data subsets, and to missing data. Excluding putative long-branch taxa, genes with high proportions of missing data, and/or genes with strong signal failed to reveal a consistent trend toward one or the other topology, suggesting that additional sources of conflict are at play.
While concatenated analyses yielded strong but conflicting support, individual gene trees mostly provided poor support for any resolution of rusts, smuts, and mushroom-forming fungi, suggesting that the true Basidiomycota tree might lie in a part of tree space that is difficult to access using either concatenation or gene tree-based approaches. Inference-based assessments of absolute model fit strongly reject best-fit models for the vast majority of genes, indicating a poor fit of even the most commonly used models. Although this is consistent with previous assessments of site-homogeneous models of amino acid evolution, poor model fit does not appear to be the sole source of confounding signal. Our analyses suggest that topologies uniting smuts with mushroom-forming fungi can arise from inappropriate modeling of amino acid sites that might be prone to systematic bias. We speculate that improved models of sequence evolution could shed more light on basal splits in the Basidiomycota, which, for now, remain unresolved despite the use of whole-genome data.