Blaisdell B E
J Mol Evol. 1983;19(2):122-33. doi: 10.1007/BF02300750.
Coding sequences of eucaryotic nuclear DNA were characterized by an excess of short runs and a deficit of long runs of weak and of strong hydrogen bonding bases; non-coding sequences by a deficit of short runs and an excess of long runs, with the same pattern for runs of purines and of pyrimidines. These attributes were conserved across DNA sequences coding for proteins of widely different function, across widely different eucaryotic species for the same protein, and across related genes that diverged long ago and now show large differences in base sequence and, if coding, in amino acid sequence. This conservation suggested that the attributes have survival value. It was concluded that these attributes constitute probabilistic constraints on the primary structure (base sequence) of both coding and non-coding DNA.
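The run statistics the abstract refers to can be illustrated with a minimal sketch. This is not the author's procedure: the function names `run_length_counts` and `expected_run_fraction` are hypothetical, and the geometric expectation assumes independent bases, which may differ from the null model used in the paper. It counts maximal runs in a sequence after reducing it to a two-letter alphabet (weak/strong hydrogen bonding, or purine/pyrimidine) and gives the fraction of runs of each length expected under that simple model.

```python
from collections import Counter
from itertools import groupby

# Two-letter alphabets: weak (A, T) vs strong (G, C) hydrogen bonding,
# and purine (A, G) vs pyrimidine (C, T).
WEAK_STRONG = {"A": "W", "T": "W", "G": "S", "C": "S"}
PURINE_PYRIMIDINE = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def run_length_counts(seq, classes):
    """Count maximal runs, keyed by (class label, run length).

    Characters not in `classes` (for example N) are skipped, so runs on
    either side of such a character are joined. This is a simplification.
    """
    labels = [classes[base] for base in seq.upper() if base in classes]
    counts = Counter()
    for label, group in groupby(labels):
        counts[(label, sum(1 for _ in group))] += 1
    return counts


def expected_run_fraction(p, length):
    """Fraction of runs of a class that have exactly `length` members.

    Assumes independent bases with class frequency p, so run lengths are
    geometric: P(length = k) = (1 - p) * p ** (k - 1).
    """
    return (1 - p) * p ** (length - 1)


if __name__ == "__main__":
    example = "AATGCCGTA"  # illustrative sequence, not data from the paper
    print(run_length_counts(example, WEAK_STRONG))
    print(run_length_counts(example, PURINE_PYRIMIDINE))
```

Comparing the observed fraction of runs at each length with `expected_run_fraction` shows whether short or long runs are in excess or deficit relative to the independence assumption.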