Synonymous substitutions substantially improve evolutionary inference from highly diverged proteins.

Author information

Seo Tae-Kun, Kishino Hirohisa

Affiliation

Professional Programme for Agricultural Bioinformatics, Graduate School of Agricultural and Life Sciences, University of Tokyo, Tokyo, Japan.

Publication information

Syst Biol. 2008 Jun;57(3):367-77. doi: 10.1080/10635150802158670.

DOI:10.1080/10635150802158670

PMID:18570032

Abstract

Codon- and amino acid-substitution models are widely used for the evolutionary analysis of protein-coding DNA sequences. Using codon models, the amounts of both nonsynonymous and synonymous DNA substitutions can be estimated. The ratio of these amounts represents the strength of selective pressure. Using amino acid models, the amount of nonsynonymous substitutions is estimated, but that of synonymous substitutions is ignored. Although amino acid models lose any information regarding synonymous substitutions, they explicitly incorporate the information for amino acid replacement, which is empirically derived from databases. It is often presumed that when the protein-coding sequences are highly divergent, synonymous substitutions might be saturated and the evolutionary analysis may be hampered by synonymous noise. However, there exists no quantitative procedure to verify whether synonymous substitutions can be ignored; therefore, amino acid models have been arbitrarily selected. In this study, we investigate the issue of a statistical comparison between codon- and amino acid-substitution models. For this purpose, we propose a new procedure to transform a 20-dimensional amino acid model to a 61-dimensional codon model. This transformation reveals that amino acid models belong to a subset of the codon models and enables us to test whether synonymous substitutions can be ignored by using the likelihood ratio. Our theoretical results and analyses of real data indicate that synonymous substitutions are very informative and substantially improve evolutionary inference, even when the sequences are highly divergent. Therefore, we note that amino acid models should be adopted only after carefully investigating and discarding the possibility that synonymous substitutions can reveal important evolutionary information.
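The two state spaces named in the abstract (20 amino acids versus 61 sense codons) can be illustrated with a short Python sketch. This is not the authors' transformation procedure, which the abstract does not spell out; it only enumerates the states and classifies single-base changes. It assumes the standard genetic code (NCBI translation table 1), and the names `CODON_TABLE`, `SENSE_CODONS`, `single_base_neighbors`, and `classify_changes` are hypothetical helpers introduced for illustration.

```python
from itertools import product

BASES = "TCAG"
# Assumption: standard genetic code (NCBI translation table 1).
# Codons are ordered TTT, TTC, TTA, TTG, TCT, ... (first base varies slowest).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its one-letter amino acid ("*" marks a stop).
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The states of a codon model: every codon that is not a stop codon.
SENSE_CODONS = [c for c, aa in CODON_TABLE.items() if aa != "*"]


def single_base_neighbors(codon):
    """Yield every codon that differs from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def classify_changes():
    """Count directed single-base changes out of sense codons by type."""
    counts = {"synonymous": 0, "nonsynonymous": 0, "nonsense": 0}
    for codon in SENSE_CODONS:
        for neighbor in single_base_neighbors(codon):
            if CODON_TABLE[neighbor] == "*":
                counts["nonsense"] += 1
            elif CODON_TABLE[neighbor] == CODON_TABLE[codon]:
                counts["synonymous"] += 1
            else:
                counts["nonsynonymous"] += 1
    return counts


if __name__ == "__main__":
    print(len(SENSE_CODONS), "sense codons")
    print(len(set(CODON_TABLE.values()) - {"*"}), "amino acids")
    print(classify_changes())
```

Changes counted as synonymous here leave the encoded amino acid unchanged, so a model defined on the 20 amino acid states cannot observe them; a model defined on the sense codons can.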

Similar articles

A combined empirical and mechanistic codon model.

Mol Biol Evol. 2007 Feb;24(2):388-97. doi: 10.1093/molbev/msl175. Epub 2006 Nov 16.

Site-to-site variation of synonymous substitution rates.

Mol Biol Evol. 2005 Dec;22(12):2375-85. doi: 10.1093/molbev/msi232. Epub 2005 Aug 17.

An empirical codon model for protein sequence evolution.

Mol Biol Evol. 2007 Jul;24(7):1464-79. doi: 10.1093/molbev/msm064. Epub 2007 Mar 30.

Selective pressures at a codon-level predict deleterious mutations in human disease genes.

J Mol Biol. 2006 May 19;358(5):1390-404. doi: 10.1016/j.jmb.2006.02.067. Epub 2006 Mar 15.

Large-scale analyses of synonymous substitution rates can be sensitive to assumptions about the process of mutation.

Gene. 2006 Aug 15;378:58-64. doi: 10.1016/j.gene.2006.04.024. Epub 2006 May 22.

Empirical models for substitution in ribosomal RNA.

Mol Biol Evol. 2004 Mar;21(3):419-27. doi: 10.1093/molbev/msh029. Epub 2003 Dec 5.

Empirical codon substitution matrix.

BMC Bioinformatics. 2005 Jun 1;6:134. doi: 10.1186/1471-2105-6-134.

Computing Ka and Ks with a consideration of unequal transitional substitutions.

BMC Evol Biol. 2006 Jun 2;6:44. doi: 10.1186/1471-2148-6-44.

Modelling the evolution of protein coding sequences sampled from Measurably Evolving Populations.

Genome Inform. 2008;21:150-64.

Cited by

Relaxation of Natural Selection in the Evolution of the Giant Lungfish Genomes.

Mol Biol Evol. 2023 Sep 1;40(9). doi: 10.1093/molbev/msad193.

DNA Sequences Are as Useful as Protein Sequences for Inferring Deep Phylogenies.

Syst Biol. 2023 Nov 1;72(5):1119-1135. doi: 10.1093/sysbio/syad036.

Next-generation development and application of codon model in evolution.

Front Genet. 2023 Jan 27;14:1091575. doi: 10.3389/fgene.2023.1091575. eCollection 2023.

Measuring Phylogenetic Information of Incomplete Sequence Data.

Syst Biol. 2022 Apr 19;71(3):630-648. doi: 10.1093/sysbio/syab073.

Ambiguity Coding Allows Accurate Inference of Evolutionary Parameters from Alignments in an Aggregated State-Space.

Syst Biol. 2021 Jan 1;70(1):21-32. doi: 10.1093/sysbio/syaa036.

Single-Copy Genes as Molecular Markers for Phylogenomic Studies in Seed Plants.

Genome Biol Evol. 2017 May 1;9(5):1130-1147. doi: 10.1093/gbe/evx070.

The First Chloroplast Genome Sequence of Boswellia sacra, a Resin-Producing Plant in Oman.

PLoS One. 2017 Jan 13;12(1):e0169794. doi: 10.1371/journal.pone.0169794. eCollection 2017.

Comparative Analysis of the Complete Chloroplast Genomes of Five Quercus Species.

Front Plant Sci. 2016 Jun 28;7:959. doi: 10.3389/fpls.2016.00959. eCollection 2016.

Trends in substitution models of molecular evolution.

Front Genet. 2015 Oct 26;6:319. doi: 10.3389/fgene.2015.00319. eCollection 2015.

AlignWise: a tool for identifying protein-coding sequence and correcting frame-shifts.

BMC Bioinformatics. 2015 Nov 9;16:376. doi: 10.1186/s12859-015-0813-8.
