Wang Liangjiang, Roossinck Marilyn J
Bioinformatics Center, Division of Biology, Kansas State University, Manhattan, KS 66506, USA.
Plant Mol Biol. 2006 Jul;61(4-5):699-710. doi: 10.1007/s11103-006-0041-8.
Codon usage bias is a ubiquitous phenomenon that may be caused by mutational bias, selection, or both. The patterns of codon usage in plants are not well understood. Datasets of expressed sequence tags (ESTs), available for many plant species, provide a resource for large-scale comparative analysis of codon usage patterns. We developed a computational approach to translate EST or assembled contig sequences, and then used the coding information for comparative analysis of codon usage in 12 plant species, including 6 eudicots, 5 monocots and the green alga Chlamydomonas reinhardtii. While codon nucleotide composition is highly conserved within eudicots and within monocots, it differs significantly between these two major taxonomic groups of higher plants. The third nucleotide position of codons is AU-rich in the eudicot genomes (35-42% G+C content) but GC-rich in the monocot genomes (59-61% G+C content). To identify optimal codons in these species, we used EST counts to estimate gene transcript levels. Codon usage bias correlated positively with gene transcript levels. Interestingly, the use of optimal codons appears to be well conserved between eudicots and monocots, and to a lesser degree between the higher plants and C. reinhardtii. Most of the optimal codons end with a C or G base, regardless of the different nucleotide composition of these genomes. The results suggest that plant codon usage is affected by translational selection, and that this selective pressure appears to be conserved in the plant kingdom.
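As an illustration of the third-position G+C statistic reported above, the following is a minimal Python sketch and not the authors' pipeline. It assumes the coding sequence is already in frame (the abstract does not describe how reading frames were determined), counts every complete codon including any stop codon, and skips ambiguous bases. The function names `gc3_content` and `codon_counts` and the toy sequence are hypothetical.

```python
from collections import Counter


def _codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into complete codons (DNA alphabet)."""
    seq = cds.upper().replace("U", "T")  # accept RNA or DNA input
    usable = len(seq) - len(seq) % 3     # drop a trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gc3_content(cds: str) -> float:
    """Fraction of G or C among unambiguous third-position bases."""
    thirds = [codon[2] for codon in _codons(cds) if codon[2] in "ACGT"]
    if not thirds:
        raise ValueError("no unambiguous third-position bases")
    return sum(base in "GC" for base in thirds) / len(thirds)


def codon_counts(cds: str) -> Counter:
    """Count occurrences of each complete codon in the sequence."""
    return Counter(_codons(cds))


# Toy sequence (not from the paper): codons ATG, GCC, AAG, TAA.
toy = "ATGGCCAAGTAA"
print(gc3_content(toy))   # third positions are G, C, G, A
print(codon_counts(toy))
```

Applied per species to all translated EST or contig sequences, a statistic of this kind would yield values comparable in spirit to the 35-42% and 59-61% ranges quoted above, although the exact figures depend on choices (frame detection, stop codon handling, sequence filtering) that are assumptions here.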