Serine codon-usage bias in deep phylogenomics: pancrustacean relationships as a case study.

Author information

Department of Biology, The National University of Ireland, Maynooth, Co. Kildare, Ireland.

Publication information

Syst Biol. 2013 Jan 1;62(1):121-33. doi: 10.1093/sysbio/sys077. Epub 2012 Sep 6.

Abstract

Phylogenomic analyses of ancient relationships are usually performed using amino acid data, but it is unclear whether amino acids or nucleotides should be preferred. With the 2-fold aim of addressing this problem and clarifying pancrustacean relationships, we explored the signals in the 62 protein-coding genes carefully assembled by Regier et al. in 2010. With reference to the pancrustaceans, this data set infers a highly supported nucleotide tree that is substantially different to the corresponding, but poorly supported, amino acid one. We show that the discrepancy between the nucleotide-based and the amino acid-based trees is caused by substitutions within synonymous codon families (especially those of serine, TCN and AGY). We show that different arthropod lineages are differentially biased in their usage of serine, arginine, and leucine synonymous codons, and that the serine bias is correlated with the topology derived from the nucleotides, but not the amino acids. We suggest that a parallel, partially compositionally driven, synonymous codon-usage bias affects the nucleotide topology. As substitutions between serine codon families can proceed through threonine or cysteine intermediates, amino acid data sets might also be affected by the serine codon-usage bias. We suggest that a Dayhoff recoding strategy would partially ameliorate the effects of such bias. Although amino acids provide an alternative hypothesis of pancrustacean relationships, neither the nucleotide nor the amino acid version of this data set seems to bring enough genuine phylogenetic information to robustly resolve the relationships within the group, which should still be considered unresolved.
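
The two ideas the abstract relies on, serine codon families (TCN versus AGY) and Dayhoff recoding, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `serine_family_counts` and `dayhoff_recode` are hypothetical, the six-category grouping is the commonly used Dayhoff scheme (the abstract does not state which variant was applied), and the numeric category labels are arbitrary.

```python
from collections import Counter

# Commonly used six-category Dayhoff grouping (assumed here; the abstract
# does not specify the exact scheme). Labels 0-5 are arbitrary.
DAYHOFF_GROUPS = ["AGPST", "DENQ", "HKR", "ILMV", "FWY", "C"]
DAYHOFF_MAP = {
    aa: str(label)
    for label, group in enumerate(DAYHOFF_GROUPS)
    for aa in group
}

# The two serine codon families in the standard genetic code.
SERINE_TCN = {"TCA", "TCC", "TCG", "TCT"}
SERINE_AGY = {"AGC", "AGT"}


def dayhoff_recode(protein):
    """Recode an amino acid sequence into six Dayhoff categories.

    Any symbol outside the 20 standard amino acids (gaps, X, ...) becomes '-'.
    """
    return "".join(DAYHOFF_MAP.get(aa, "-") for aa in protein.upper())


def serine_family_counts(cds):
    """Count serine codons per family (TCN vs. AGY) in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    counts = Counter()
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = cds[i:i + 3]
        if codon in SERINE_TCN:
            counts["TCN"] += 1
        elif codon in SERINE_AGY:
            counts["AGY"] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    print(serine_family_counts("TCAAGCTCTGGA"))
    print(dayhoff_recode("STC"))
```

Under this grouping serine and threonine fall in the same category while cysteine has a category of its own, so a serine-to-threonine intermediate is masked by the recoding but a serine-to-cysteine one is not. That is consistent with the abstract's statement that Dayhoff recoding would only partially ameliorate the bias.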
