Sueoka Noboru
Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO 80309-0347, USA.
Gene. 2002 Oct 30;300(1-2):141-54. doi: 10.1016/s0378-1119(02)01046-6.
The intra-strand Parity Rule 2 of DNA (PR2) states that A=T and G=C within each strand. Useful corollaries of PR2 are G/(G+C)=A/(A+T)=0.5, G/(G+A)=C/(C+T)=G+C, and G/(G+T)=C/(C+A)=G+C. Here, A, T, G, and C represent the relative contents of the four nucleotide residues in a specific strand of DNA, so that A+T+G+C=1. Thus, deviations from PR2 are a sign of strand-specific (or asymmetric) mutation and/or selection pressures. The present study delineates the symmetric and asymmetric effects of mutations on the intra-genomic heterogeneity of the G+C content in the human genome. The results of this study on the human genome are: (1) When both two- and four-codon amino acids were combined, only slight departures from PR2 were observed over the total range of G+C content at the third codon position. Thus, the G+C heterogeneity is likely to be caused by mutagenesis that is symmetric between the two strands. (2) The above result makes the deamination of cytosine due to double-strand breathing of DNA [Mol. Biol. Evol. 17 (2000) 1371] and/or the incorporation of oxidized guanine (8-oxo-guanine) opposite adenine during DNA replication (dGTP-oxidation hypothesis) the most likely candidates for the major cause of the diversity of the G+C content. (3) Patterns of amino acid-specific PR2 biases, detected by plotting the PR2 corollaries against the G+C content of the third codon position, revealed that the eight four-codon amino acids can be divided into three types by the second codon letter: (a) C(2)-type (Ala, Pro, Ser4, and Thr), (b) G(2)-type (Arg4 and Gly), and (c) T(2)-type (Leu4 and Val). (4) Most of the asymmetric plot patterns of the above three types in PR2 biases can be explained by C(2)→T(2) deamination, which converts C(2)pG(3) of the C(2)-type to T(2)pG(3) (T(2)-type), in both human and chicken. This explains the existence of some preferred codons in human and chicken. However, these asymmetric biases hardly contribute to the overall G+C content diversity of the third codon position.
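The PR2 corollaries stated in the abstract can be checked numerically on any single-strand sequence. The following is a minimal sketch, not the author's own procedure: the function names `base_fractions`, `pr2_statistics`, and `third_codon_positions` are hypothetical, and the sketch assumes a plain string of A/T/G/C letters (any other characters are ignored) and, for the third-position helper, a coding sequence that starts in frame.

```python
from collections import Counter


def base_fractions(seq):
    """Return relative contents of A, T, G, C on one strand (they sum to 1)."""
    counts = Counter(b for b in seq.upper() if b in "ATGC")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/T/G/C residues")
    return {b: counts[b] / total for b in "ATGC"}


def pr2_statistics(seq):
    """Compute G+C content and the six ratios used in the PR2 corollaries.

    Under PR2 (A=T, G=C) the first two ratios equal 0.5 and the
    remaining four equal the G+C content.
    """
    f = base_fractions(seq)
    a, t, g, c = f["A"], f["T"], f["G"], f["C"]

    def ratio(x, y):
        # Undefined when both bases are absent from the sequence.
        return x / (x + y) if (x + y) > 0 else float("nan")

    return {
        "GC": g + c,
        "G/(G+C)": ratio(g, c),
        "A/(A+T)": ratio(a, t),
        "G/(G+A)": ratio(g, a),
        "C/(C+T)": ratio(c, t),
        "G/(G+T)": ratio(g, t),
        "C/(C+A)": ratio(c, a),
    }


def third_codon_positions(cds):
    """Extract every third base, assuming the sequence starts in frame."""
    return cds[2::3]
```

For a strand that obeys PR2 exactly, such as `"AATTGGGCCC"`, `pr2_statistics` returns 0.5 for the first two ratios and the G+C content for the other four; departures from those values in real data are the PR2 biases discussed above. To analyze third codon positions only, pass `third_codon_positions(cds)` to `pr2_statistics`.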