Bell S J, Forsdyke D R
Department of Biochemistry, Queen's University, Kingston, Ontario, Canada K7L 3N6.
J Theor Biol. 1999 Mar 7;197(1):51-61. doi: 10.1006/jtbi.1998.0857.
Chargaff's first parity rule (%A=%T and %G=%C) is explained by the Watson-Crick model for duplex DNA in which complementary base pairs form individual accounting units. Chargaff's second parity rule is that the first rule also applies to single strands of DNA. The limits of accounting units in single strands were examined by moving windows of various sizes along sequences and counting the relative proportions of A and T (the W bases), and of C and G (the S bases). Shuffled sequences account, on average, over shorter regions than the corresponding natural sequence. For an E. coli segment, S base accounting is, on average, contained within a region of 10 kb, whereas W base accounting requires regions in excess of 100 kb. Accounting requires the entire genome (190 kb) in the case of Vaccinia virus, which has an overall "Chargaff difference" of only 0.086% (i.e. only one in 1162 bases does not have a potential pairing partner in the same strand). Among the chromosomes of Saccharomyces cerevisiae, the total Chargaff differences for the W bases and for the S bases are usually correlated. In general, Chargaff differences for a natural sequence and its shuffled counterpart diverge maximally when 1 kb sequence windows are employed. This should be the optimum window size for examining correlations between Chargaff differences and sequence features which have arisen through natural selection. We propose that Chargaff's second parity rule reflects the evolution of genome-wide stem-loop potential as part of short- and long-range accounting processes which work together to sustain the integrity of various levels of information in DNA.
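The moving-window accounting described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that a "Chargaff difference" is the percentage of bases in a window with no potential pairing partner in the same strand, i.e. |A-T| for the W bases and |C-G| for the S bases, divided by the window length. That definition is inferred from the parenthetical in the abstract (one unpaired base in 1162 corresponds to 100/1162, about 0.086%). The function names, the non-overlapping default step, and the seeded shuffle are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def chargaff_difference(seq):
    """Return (W, S, total) Chargaff differences for one strand, in percent.

    Assumed definition: bases lacking a same-strand partner, as a
    percentage of the sequence length (W = |A-T|, S = |C-G|).
    """
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq.upper())
    w = 100.0 * abs(counts["A"] - counts["T"]) / n
    s = 100.0 * abs(counts["C"] - counts["G"]) / n
    return w, s, w + s


def windowed_differences(seq, window, step=None):
    """Slide a window along seq and return the differences for each window.

    step defaults to the window size (non-overlapping windows); this is
    an illustrative choice, not taken from the paper.
    """
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    step = step or window
    return [
        chargaff_difference(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


def mean_total_difference(seq, window):
    """Average the total (W + S) difference over all windows."""
    diffs = windowed_differences(seq, window)
    return sum(total for _, _, total in diffs) / len(diffs)


def shuffled(seq, seed=0):
    """Return a shuffled copy of seq with the same base composition."""
    bases = list(seq)
    random.Random(seed).shuffle(bases)
    return "".join(bases)
```

Comparing `mean_total_difference(seq, window)` with `mean_total_difference(shuffled(seq), window)` across a range of window sizes gives a rough analogue of the natural-versus-shuffled comparison in the abstract. Because shuffling preserves base composition, the two values coincide when the window spans the whole sequence and can differ only at smaller window sizes.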