

Plastid phylogenomic analyses of the Selaginella sanguinolenta group (Selaginellaceae) reveal conflict signatures resulting from sequence types, outlier genes, and pervasive RNA editing.

Affiliations

State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; University of Chinese Academy of Sciences, Beijing 100049, China.

State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China.

Publication information

Mol Phylogenet Evol. 2022 Aug;173:107507. doi: 10.1016/j.ympev.2022.107507. Epub 2022 May 16.

Abstract

Unlike the generally conserved plastomes (plastid genomes) of most land plants, Selaginellaceae plastomes exhibit dynamic structure, high GC content, and high substitution rates. Previous plastome analyses identified strong conflict on several clades in Selaginella; however, the factors causing these conflicts and their impact on phylogenetic inference have not been sufficiently investigated. Here, we dissect the distribution of phylogenetic signals and conflicts in the Selaginella sanguinolenta group, whose plastomes have a DR (direct repeat) structure and genome-wide RNA editing. We analyzed data sets comprising 22 plastomes representing all species of the S. sanguinolenta group and covering its entire geographical distribution, from the Himalayas to Siberia and the Russian Far East. We recovered four different topologies by applying multispecies coalescent (ASTRAL) and concatenation (IQ-TREE and RAxML) methods to four data sets: PC (protein-coding genes), NC (non-coding sequences), PCN (concatenated PC and NC), and RC (in which the "C" at predicted RNA editing sites was replaced by "T"). Six monophyletic clades (the S. nummularifolia, S. rossii, S. sajanensis, S. sanguinolenta I, S. sanguinolenta II, and S. sanguinolenta III clades) were consistently resolved and were supported by GC content, RNA editing frequency, and gene content. However, the relationships among these clades varied across the four topologies. To explore the underlying causes of this uncertainty, we compared the phylogenetic signals of the four topologies. We found that sequence type (coding versus non-coding), outlier genes (genes with extremely high |ΔGLS| values), and the frequency of C-to-U RNA editing in protein-coding genes were responsible for the unstable phylogenomic relationships. We further found a significant positive correlation between |ΔGLS| values and the coefficient of variation of the RNA editing number. Our results show that the coalescent method performed better than the concatenation method in overcoming the problems caused by outlier genes and extreme RNA editing events. Our study highlights the importance of exploring plastid phylogenomic conflicts and suggests that concatenated analyses of organelle genome data be conducted cautiously.
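The |ΔGLS| comparison described in the abstract can be illustrated with a minimal Python sketch. It assumes that ΔGLS denotes the per-gene difference in log-likelihood score between two competing topologies (the abstract does not spell this out) and that the "RNA editing number" is a per-sample count of editing sites for each gene. The gene names, the numbers, and the outlier cutoff (mean plus k standard deviations) are hypothetical placeholders, not values or criteria from the study, and the sketch reports only the correlation coefficient, not its significance.

```python
import math
from statistics import mean, pstdev

def delta_gls(lnl_t1, lnl_t2):
    """Per-gene log-likelihood difference between topology 1 and topology 2."""
    return {gene: lnl_t1[gene] - lnl_t2[gene] for gene in lnl_t1}

def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(counts)
    return pstdev(counts) / m if m else 0.0

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def outlier_genes(dgls, k=1.5):
    """Genes whose |dGLS| exceeds mean + k * sd (an assumed, illustrative cutoff)."""
    abs_vals = [abs(v) for v in dgls.values()]
    cutoff = mean(abs_vals) + k * pstdev(abs_vals)
    return sorted(g for g, v in dgls.items() if abs(v) > cutoff)

# Hypothetical per-gene log-likelihoods under two candidate topologies.
lnl_t1 = {"gene_a": -1000.0, "gene_b": -2000.0, "gene_c": -1500.0, "gene_d": -3000.0}
lnl_t2 = {"gene_a": -1001.0, "gene_b": -1998.0, "gene_c": -1500.5, "gene_d": -3020.0}

# Hypothetical RNA editing site counts per gene, one value per sampled plastome.
editing_counts = {
    "gene_a": [10, 10, 11, 10],
    "gene_b": [20, 22, 19, 21],
    "gene_c": [5, 5, 5, 5],
    "gene_d": [2, 30, 8, 40],
}

dgls = delta_gls(lnl_t1, lnl_t2)
genes = sorted(dgls)
abs_dgls = [abs(dgls[g]) for g in genes]
cvs = [coefficient_of_variation(editing_counts[g]) for g in genes]

outliers = outlier_genes(dgls)
r = pearson_r(abs_dgls, cvs)
```

In this toy input, the gene with the most variable editing count is also the one that most strongly favors one topology, which is the kind of pattern the reported correlation describes. With real data, the per-gene log-likelihoods would come from the tree-inference software and the editing counts from an RNA editing site predictor.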

