Xiao Guohua, Tang Guirong, Wang Chengshu
School of Computer Science, Fudan University, Shanghai 200433, China.
CAS Key Laboratory of Insect Developmental and Evolutionary Biology, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai 200032, China.
J Fungi (Basel). 2020 Aug 13;6(3):134. doi: 10.3390/jof6030134.
Amid the explosion of genomic data, phylogenomic analysis has resolved the tree of life for diverse organisms, including fungi. Genome-wide clustering has also been performed on gene content data, which can mitigate the problem of unequal evolutionary rates among genes. In this study, using different fungal species as models, we performed phylogenomic and protein-content (PC)-based clustering analyses. The resulting sequence-based tree reflects the phylogenetic trajectory of the examined fungal species. In contrast, 15 PC-based trees constructed from the Pfam matrices of the whole genomes, four protein families, and ten subcellular locations largely failed to resolve the speciation relationships of cross-phylum fungal species. Nevertheless, lifestyle and taxonomic associations were more or less evident among closely related fungal species in the PC-based trees. Pairwise congruence tests indicated varying levels of congruence or discordance between the sequence- and PC-based trees, and among the PC-based trees. Intriguingly, a few protein family and subcellular PC-based trees were topologically more similar to the phylogenomic tree than was the whole-genome PC-based tree. In particular, the most significant level of congruence was observed between the sequence-based and cell wall PC-based trees. The cophylogenetic analysis conducted in this study may help predict the magnitude of evolutionary conservation, interactive associations, or networking among proteins of different families or subcellular locations.
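The general idea of PC-based clustering followed by a tree comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the abstract does not specify the distance measure, the clustering algorithm, or the congruence test, so Bray-Curtis dissimilarity, average-linkage (UPGMA) clustering, and a simple shared-clade fraction are stand-ins chosen for brevity. The species names, family labels, counts, and reference tree are hypothetical toy values.

```python
from itertools import combinations

# Hypothetical toy matrix: species -> {protein family -> copy number}.
# Real input would be a Pfam count matrix; these values are invented.
profiles = {
    "sp_A": {"famX": 10, "famY": 2},
    "sp_B": {"famX": 9, "famY": 3},
    "sp_C": {"famX": 1, "famY": 12},
    "sp_D": {"famX": 2, "famY": 11},
}

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two family-count profiles."""
    families = set(a) | set(b)
    diff = sum(abs(a.get(f, 0) - b.get(f, 0)) for f in families)
    total = sum(a.get(f, 0) + b.get(f, 0) for f in families)
    return diff / total if total else 0.0

def upgma(profiles):
    """Average-linkage clustering; returns a nested-tuple tree of species names."""
    clusters = {name: [name] for name in profiles}  # tree -> its leaf names
    dist = {frozenset(p): bray_curtis(profiles[p[0]], profiles[p[1]])
            for p in combinations(profiles, 2)}

    def avg(c1, c2):
        # Mean pairwise distance between the leaves of two clusters.
        pairs = [(x, y) for x in clusters[c1] for y in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: avg(*p))
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))

def clades(tree):
    """Set of leaf sets, one per internal node of a nested-tuple tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found

def shared_clade_fraction(t1, t2):
    """Fraction of non-root clades of t1 that also occur in t2."""
    c1, c2 = clades(t1), clades(t2)
    root = max(c1, key=len)  # the root clade is trivially shared
    c1, c2 = c1 - {root}, c2 - {root}
    return len(c1 & c2) / max(len(c1), 1)

pc_tree = upgma(profiles)
# Hypothetical sequence-based reference tree for the same four species.
reference = (("sp_A", "sp_B"), ("sp_C", "sp_D"))
print(pc_tree, shared_clade_fraction(reference, pc_tree))
```

A shared-clade fraction is a crude topological measure. A statistical congruence test, such as those used in cophylogenetic analysis, would additionally assess whether the observed similarity exceeds what is expected by chance.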