Marquez Roberto, Smit Sandra, Knight Rob
Department of Computer Science, New Mexico State University, MSC CS, Las Cruces, NM 88003, USA.
Genome Biol. 2005;6(11):R91. doi: 10.1186/gb-2005-6-11-r91. Epub 2005 Oct 19.
Do species use codons that reduce the impact of errors in translation or replication? The genetic code is arranged in a way that minimizes errors, defined as the sum of the differences in amino-acid properties caused by single-base changes from each codon to each other codon. However, the extent to which organisms optimize the genetic messages written in this code has been far less studied. We tested whether codon and amino-acid usages from 457 bacteria, 264 eukaryotes, and 33 archaea minimize errors compared to random usages, and whether changes in genome G+C content influence these error values.
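The error definition above can be made concrete with a minimal sketch. It is not the authors' implementation: the Kyte-Doolittle hydropathy scale stands in for whichever amino-acid property the study used, differences are taken as absolute values, substitutions to or from stop codons are skipped, and weighting each codon's error by its usage frequency is an assumption about how a usage-level error value might be formed. The names `codon_error` and `usage_error` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order (UUU, UUC, UUA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy: an assumed stand-in for the property used in the study.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def neighbors(codon):
    """Yield the nine codons that differ from `codon` by a single base."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def codon_error(codon):
    """Sum of absolute property differences over single-base changes.

    Assumption: changes that produce a stop codon are ignored.
    """
    aa = CODE[codon]
    return sum(
        abs(HYDROPATHY[aa] - HYDROPATHY[CODE[n]])
        for n in neighbors(codon)
        if CODE[n] != "*"
    )


def usage_error(usage):
    """Usage-weighted mean of per-codon errors (assumed weighting scheme).

    `usage` maps sense codons to counts or frequencies.
    """
    total = sum(usage.values())
    return sum(freq * codon_error(c) for c, freq in usage.items()) / total


if __name__ == "__main__":
    sense = [c for c, aa in CODE.items() if aa != "*"]
    uniform = {c: 1 for c in sense}
    print(f"error under uniform codon usage: {usage_error(uniform):.3f}")
```

Under these assumptions, a genome's error value is `usage_error` applied to its codon counts, which can then be compared across genomes or against randomized usages.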
We tested two hypotheses: that organisms choose their codon usage to minimize errors, and that the large observed variation in G+C content in coding sequences, in contrast to the low variation in G+U or G+A content, is due to differences in the effects of variation along these axes on the error value. Surprisingly, the biological distribution of error values has far lower variance than randomized error values, yet the error values of actual codon and amino-acid usages are greater than would be expected by chance.
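A comparison between an observed usage and randomized usages could be sketched as follows. The abstract does not state how the random usages were generated, so the sampling scheme here (independent exponential draws, normalized, which is uniform over frequency vectors) is purely an assumption, as are the function names and the toy per-codon error values.

```python
import random


def weighted_error(usage, per_codon_error):
    """Usage-weighted mean of the supplied per-codon error values."""
    total = sum(usage.values())
    return sum(usage[c] * per_codon_error[c] for c in usage) / total


def random_usage(codons, rng):
    """Draw one random usage vector (assumed scheme: normalized exponentials)."""
    return {c: rng.expovariate(1.0) for c in codons}


def fraction_random_below(observed_usage, per_codon_error, n_samples=1000, seed=0):
    """Fraction of random usages whose error is lower than the observed error.

    A value near 1 means the observed usage is worse than most random
    usages; a value near 0 means it is better than most.
    """
    rng = random.Random(seed)
    codons = list(observed_usage)
    observed = weighted_error(observed_usage, per_codon_error)
    below = sum(
        weighted_error(random_usage(codons, rng), per_codon_error) < observed
        for _ in range(n_samples)
    )
    return below / n_samples


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    toy_error = {"AAA": 1.0, "CCC": 3.0, "GGG": 2.0}
    toy_usage = {"AAA": 10, "CCC": 60, "GGG": 30}
    print(fraction_random_below(toy_usage, toy_error))
```

In this framing, the reported result corresponds to observed usages whose error exceeds that of most random usages, while the spread of observed errors across genomes is much narrower than the spread of the random ones.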
These unexpected findings suggest that selection against translation error has not produced codon or amino-acid usages that minimize the effects of errors, and that even messages with very different nucleotide compositions somehow maintain a relatively constant error value. They raise the question: why do all known organisms use highly error-minimizing genetic codes, but fail to minimize the errors in the mRNA messages they encode?