Albrecht-Buehler Guenter
Department of Cell and Molecular Biology, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611, USA.
Proc Natl Acad Sci U S A. 2006 Nov 21;103(47):17828-33. doi: 10.1073/pnas.0605553103. Epub 2006 Nov 8.
Chargaff's second parity rules for mononucleotides and oligonucleotides (CIImono and CIIoligo rules) state that a sufficiently long (> 100 kb) strand of genomic DNA that contains N copies of a mono- or oligonucleotide, also contains N copies of its reverse complementary mono- or oligonucleotide on the same strand. There is very strong support in the literature for the validity of the rules in coding and noncoding regions, especially for the CIImono rule. Because the experimental support for the CIIoligo rule is much less complete, the present article, focusing on the special case of trinucleotides (triplets), examined several gigabases of genome sequences from a wide range of species and kingdoms including organelles such as mitochondria and chloroplasts. I found that all genomes, with the only exception of certain mitochondria, complied with the CIItriplet rule at a very high level of accuracy in coding and noncoding regions alike. Based on the growing evidence that genomes may contain up to millions of copies of interspersed repetitive elements, I propose in this article a quantitative formulation of the hypothesis that inversions and inverted transposition could be a major contributing if not dominant factor in the almost universal validity of the rules.
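As an illustration of what the CIItriplet rule asserts, the following minimal Python sketch counts overlapping triplets on a single strand and compares each count with that of its reverse complement on the same strand. It is not the procedure used in the article: the function names and the deviation measure |N - M| / (N + M) are assumptions chosen for illustration, and windows containing symbols other than A, C, G, or T are simply skipped.

```python
from collections import Counter

# Translation table for Watson-Crick complements (assumes an A/C/G/T alphabet).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = set("ACGT")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def triplet_counts(strand):
    """Count overlapping triplets on one strand, skipping windows with non-ACGT symbols."""
    strand = strand.upper()
    windows = (strand[i:i + 3] for i in range(len(strand) - 2))
    return Counter(w for w in windows if set(w) <= _BASES)


def max_relative_deviation(strand):
    """Largest |N - M| / (N + M) over all observed triplets, where N is the count
    of a triplet and M the count of its reverse complement on the same strand.
    This measure is an illustrative assumption, not the article's statistic."""
    counts = triplet_counts(strand)
    worst = 0.0
    for triplet, n in counts.items():
        m = counts[reverse_complement(triplet)]  # a missing key counts as 0
        worst = max(worst, abs(n - m) / (n + m))
    return worst
```

A value near 0 indicates close compliance with the rule and a value near 1 indicates strong violation. A strand formed by appending its own reverse complement to a sequence is identical to its reverse complement, so it satisfies the rule exactly, which gives a simple sanity check. Real genomic strands would need to be read from sequence files and, per the abstract, should be longer than 100 kb for the rule to be expected to hold.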