State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.
Mol Phylogenet Evol. 2023 Dec;189:107924. doi: 10.1016/j.ympev.2023.107924. Epub 2023 Sep 10.
Psyllids (class Insecta: order Hemiptera: superfamily Psylloidea) are a taxonomically and phylogenetically challenging clade. Recent studies have substantially advanced the phylogeny of this group, yet the family-level relationships among Aphalaridae, Carsidaridae, and the other families remain unresolved. Genome-scale phylogenetic analysis is known to provide finer resolution for such problems. However, phylogenomics also introduces a new problem: systematic error (bias) can yield incorrect trees with high confidence. Here we addressed these issues using hundreds of single-copy orthologous (SCO) genes from psyllid transcriptomes and genomes. Our analyses revealed conflicts between the nucleotide-based and amino-acid-based phylogenetic trees. While the nucleotide-based phylogeny strongly supported the (Aphalaridae + Carsidaridae) + Others relationship, the amino-acid-based phylogeny recovered Aphalaridae + (Carsidaridae + Others) with 100% support. Further inspection revealed significant compositional heterogeneity in the nucleotide sequences of 67% of SCO genes, but not in the corresponding translated amino acid sequences. We then tested different strategies to counter this compositional bias and found that, under the RY-coding strategy (recoding the standard nucleotides as purines and pyrimidines), the nucleotide-based phylogeny became consistent with the amino-acid-based one. We further applied RY-coding to a published concatenated nucleotide dataset and recovered a monophyletic Aphalaridae at the base of the psyllid tree, a result that the original study, based on non-recoded sequences, had refuted. Moreover, we found that variation in evolutionary rate could lead to errors in the nucleotide-based phylogeny: the fast-evolving Heteropsylla cubana (Psyllidae: Ciriacreminae) was incorrectly placed within the subfamily Psyllinae. This bias can be avoided by using data removal or RY-coding strategies.
Together, our results strongly support the family-level relationship Aphalaridae + (Carsidaridae + Others) and show that amino-acid-based concatenation analysis is more robust than the nucleotide-based one. Future phylogenomic analyses of psyllid nucleotide sequences should consider methods such as the RY-coding scheme to address potential systematic biases arising from compositional and rate heterogeneity.
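To make the RY-coding idea concrete, the following minimal Python sketch recodes purines (A, G) as R and pyrimidines (C, T, U) as Y. It is only an illustration, not the pipeline used in the study: the function names `ry_recode` and `state_frequencies` and the two toy taxa are hypothetical, and the sequences are invented to exaggerate a base-composition difference.

```python
from collections import Counter

# Purines (A, G) become R; pyrimidines (C, T, U) become Y.
# Any other symbol (gaps, N, ambiguity codes) is left unchanged.
RY_TABLE = str.maketrans("AGCTU", "RRYYY")


def ry_recode(seq: str) -> str:
    """Return the RY-recoded, upper-cased version of a nucleotide sequence."""
    return seq.upper().translate(RY_TABLE)


def state_frequencies(seq: str, states: str) -> dict:
    """Relative frequency of each character in `states`, ignoring all others."""
    counts = Counter(ch for ch in seq.upper() if ch in states)
    total = sum(counts.values())
    return {s: (counts[s] / total if total else 0.0) for s in states}


# Hypothetical toy alignment: two taxa that differ only by transitions
# (A<->G, T<->C), giving one AT-rich and one GC-rich sequence.
alignment = {
    "taxon_AT_rich": "ATATTAATTAAT",
    "taxon_GC_rich": "GCGCCGGCCGGC",
}

recoded = {name: ry_recode(seq) for name, seq in alignment.items()}

for name, seq in alignment.items():
    print(name, "ACGT:", state_frequencies(seq, "ACGT"))
    print(name, "RY:  ", state_frequencies(recoded[name], "RY"))
```

In this toy case the two taxa have opposite nucleotide compositions but identical RY-recoded sequences, because recoding hides transitions and keeps only transversions. That is why RY-coding can reduce compositional bias, at the cost of discarding some phylogenetic signal.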