De Amicis F, Marchetti S
Dipartimento di Produzione Vegetale e Tecnologie Agrarie, University of Udine, Via delle Scienze 208, 33100 Udine, Italy.
Nucleic Acids Res. 2000 Sep 1;28(17):3339-45. doi: 10.1093/nar/28.17.3339.
In this work, 710 CDSs corresponding to over 290,000 codons equally distributed between Brassica napus, Arabidopsis thaliana, Lycopersicon esculentum, Nicotiana tabacum, Pisum sativum, Glycine max, Oryza sativa, Triticum aestivum, Hordeum vulgare and Zea mays were considered. For each amino acid, synonymous codon choice was determined in the presence of A, G, C or T as the initial nucleotide of the subsequent triplet; data were statistically analysed under the hypothesis of an independent assortment of codons. In 33.4% of cases, a frequency significantly (P = 0.01) different from that expected was recorded. This was mainly due to a pervasive intercodon TpA and CpG deficiency. As a general rule, intercodon TpAs and CpGs were preferentially replaced by CpAs and TpGs, respectively. In several instances, codon frequencies were also modified to avoid homotetramer and homotrimer formation, to reduce intercodon ApCs downstream of (1,2) GG or AG dinucleotides, and to increase GpA or ApG intercodons in certain contexts. Since TpA, CpG and homotetra(tri)mer deficiency directly or indirectly accounted for 77% of the significant variation in codon frequency, it can be concluded that codon usage mirrors precise needs at the DNA structure level. Plant species exhibited a phylogenetically related adaptation to structural constraints. Codon usage flexibility was reflected in strikingly different arrays of optimum codons for probe design.
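The context analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`context_counts`, `expected_under_independence`, `chi_square`) are hypothetical, the standard nuclear genetic code is assumed, and a plain Pearson chi-square statistic stands in for whatever test the paper actually applied. For each amino acid, the sketch tabulates synonymous codons against the first nucleotide of the following triplet and derives the counts expected if codon choice and the next nucleotide were independent.

```python
from collections import Counter, defaultdict

# Standard genetic code with bases ordered T, C, A, G (assumption: nuclear code).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def context_counts(cds_list):
    """Return counts[amino_acid][(codon, next_base)] over in-frame CDSs.

    next_base is the first nucleotide of the following triplet; the last
    codon of each CDS is skipped because it has no complete successor.
    """
    counts = defaultdict(Counter)
    for cds in cds_list:
        seq = cds.upper()
        for i in range(0, len(seq) - 5, 3):
            codon, next_base = seq[i:i + 3], seq[i + 3]
            aa = CODON_TABLE.get(codon)
            # Skip ambiguous codons, stop codons and ambiguous next bases.
            if aa is None or aa == "*" or next_base not in BASES:
                continue
            counts[aa][(codon, next_base)] += 1
    return counts


def expected_under_independence(table):
    """Expected count per (codon, next_base) cell: row total * column total / N."""
    row, col = Counter(), Counter()
    for (codon, next_base), n in table.items():
        row[codon] += n
        col[next_base] += n
    total = sum(row.values())
    return {(c, b): row[c] * col[b] / total for c in row for b in col}


def chi_square(table):
    """Pearson chi-square statistic for one amino acid's codon-by-context table."""
    expected = expected_under_independence(table)
    return sum((table.get(cell, 0) - e) ** 2 / e for cell, e in expected.items() if e > 0)
```

In such a table, a cell whose observed count falls well below its expected value (for example, an NNT codon followed by A, which creates an intercodon TpA) would correspond to the kind of deficiency the abstract reports.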