Liò P, Politi A, Buiatti M, Ruffo S
Dipartimento di Biologia Animale e Genetica, Università di Firenze, Italy.
J Theor Biol. 1996 May 21;180(2):151-60. doi: 10.1006/jtbi.1996.0091.
We used an improved block-entropy measure to gain further insight into the short-range correlations present in whole chromosomes of S. cerevisiae, viruses and organelles, and in very large genomic regions of E. coli. Although DNA sequences are largely inhomogeneous and word frequencies are unevenly distributed, the comparison of entire chromosomes and large genomic regions shows a "bulk" compositional homogeneity. This property suggests that biases in selection, directional mutational pressure and recombination processes act to homogenize the base composition of the DNA molecules within a genome, but that their mode of action, relative impact and direction may vary among organisms. The most interesting results appear to be the differences between the SW (C,G/A,T) and RY (A,G/C,T) two-letter alphabet entropies. Deviations from randomness in E. coli and S. cerevisiae sequences particularly concern SW dinucleotide frequencies and RY tetranucleotide frequencies.
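To make the SW and RY comparison concrete, the following is a minimal sketch of a block-entropy calculation on the two recoded alphabets. It uses the plain Shannon entropy of overlapping words, not the improved estimator the abstract refers to, whose details are not given here. The function names `recode` and `block_entropy` and the toy sequence are illustrative assumptions, not part of the original work.

```python
import math
from collections import Counter

# Two-letter recodings named in the abstract:
# SW groups C,G (S) against A,T (W); RY groups A,G (R) against C,T (Y).
SW = {"C": "S", "G": "S", "A": "W", "T": "W"}
RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode(seq, mapping):
    """Map a DNA string onto a two-letter alphabet.

    Symbols outside A/C/G/T (for example N) are dropped, which joins
    their neighbours; acceptable for a sketch, not for real analysis.
    """
    return "".join(mapping[b] for b in seq.upper() if b in mapping)


def block_entropy(seq, n):
    """Plain Shannon entropy, in bits, of overlapping words of length n.

    Assumption: naive frequency estimate with no finite-size correction.
    """
    words = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    if not words:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(words)
    total = len(words)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    toy = "ATGCGCGATATCGCGTATAAGCGC"  # hypothetical toy sequence
    for name, mapping in (("SW", SW), ("RY", RY)):
        coded = recode(toy, mapping)
        for n in (2, 4):  # dinucleotide and tetranucleotide words
            print(name, n, round(block_entropy(coded, n), 3))
```

For an uncorrelated two-letter sequence with equiprobable symbols, the entropy of words of length n approaches n bits, so values below that bound indicate compositional bias or short-range correlation. On short sequences the naive estimate is biased downward, which is presumably one reason an improved measure is needed for longer words.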