Karlin S, Mrázek J
Department of Mathematics, Stanford University, CA 94305-2125, USA.
J Mol Biol. 1996 Oct 4;262(4):459-72. doi: 10.1006/jmbi.1996.0528.
Synonymous codon usage is biased, and the bias appears to differ among organisms. Factors with proposed roles in causing codon bias include degree and timing of gene expression, codon-anticodon interactions, transcription and translation rate and fidelity, codon context, and global and local G + C content. We offer a new perspective and new methods for elucidating codon choices, applied especially to the human genome. We present data supporting the thesis that codon choices for human genes are largely a consequence of two factors: (1) amino acid constraints, and (2) the maintenance of DNA structures dependent on base-step conformational tendencies consistent with the organism's genome signature, which is determined by genome-wide processes of DNA modification, replication and repair. The related codon signature, defined as the dinucleotide relative abundances at the distinct codon positions (1,2), (2,3), and (3,4) (4 = 1 of the next codon), accommodates both the global genome signature and amino acid constraints. In human genes, codon positions (2,3) and (3,4), which contain the silent site, have similar codon signatures, reflecting DNA symmetry. Strong CG and TA dinucleotide underrepresentation is observed at all codon positions as well as in non-coding regions. Estimates of synonymous codon usage based on codon signatures are in excellent agreement with the actual codon usage in human and general vertebrate genes. These properties are largely independent of the isochore compartment (G + C content), gene size, and transcriptional and translational constraints. We hypothesize that major influences on codon usage in human genes result from residue preferences and diresidue associations in proteins coupled to biases on the DNA level, related to replication and repair processes and/or DNA structural requirements.
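
The codon signature described above can be illustrated with a short Python sketch. The abstract does not state the formula, so this assumes the usual odds-ratio form of relative abundance, rho_XY = f_XY / (f_X * f_Y), where f_XY is the frequency of dinucleotide XY at a given codon-position pair and f_X, f_Y are the frequencies of X and Y at the first and second position of that pair. The function name `codon_signature` and its parameters are hypothetical, and the input is assumed to be an in-frame coding sequence.

```python
from collections import Counter

def codon_signature(cds, pair=(1, 2)):
    """Dinucleotide relative abundances at one codon-position pair.

    pair is (1, 2), (2, 3) or (3, 4); position 4 is position 1 of the
    next codon. Assumes rho_XY = f_XY / (f_X * f_Y), which is not
    spelled out in the abstract.
    """
    if pair not in ((1, 2), (2, 3), (3, 4)):
        raise ValueError("pair must be (1, 2), (2, 3) or (3, 4)")
    cds = cds.upper()
    offset = pair[0] - 1           # 0-based offset of the first base in the codon
    usable = 3 * (len(cds) // 3)   # ignore a trailing partial codon
    first, second, both = Counter(), Counter(), Counter()
    for start in range(0, usable, 3):
        i = start + offset
        if i + 1 >= usable:        # the last codon has no following codon for (3, 4)
            break
        x, y = cds[i], cds[i + 1]
        if x not in "ACGT" or y not in "ACGT":
            continue               # skip ambiguous bases
        first[x] += 1
        second[y] += 1
        both[x + y] += 1
    total = sum(both.values())
    signature = {}
    for xy, count in both.items():
        f_x = first[xy[0]] / total
        f_y = second[xy[1]] / total
        signature[xy] = (count / total) / (f_x * f_y)
    return signature

# Toy in-frame sequence (not real data): codons ATG GCC ATG GCC
toy = "ATGGCCATGGCC"
for positions in ((1, 2), (2, 3), (3, 4)):
    print(positions, codon_signature(toy, positions))
```

Under this assumed definition, values well below 1 indicate underrepresentation of a dinucleotide at that codon-position pair, which is the sense in which the abstract reports CG and TA as underrepresented.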