Zuckerkandl E
Mol Biol Rep. 1981 May 22;7(1-3):149-58. doi: 10.1007/BF00778746.
It is proposed that a general function of noncoding DNA and RNA sequences in higher organisms (intergenic and intervening sequences) is to provide multiple binding sites over long stretches of polynucleotide for certain types of regulatory proteins. Through the building up or abolishing of high-order structures, these proteins either sequester sites for the control of, e.g., transcription or make the sites available to local molecular signals. If this is to take place, the existence of a 'C-value paradox' becomes a requirement. Multiple binding sites for a given protein may recur in the form of a sequence 'motif' that is variable within certain limits. Noncoding sequences of the chicken ovalbumin gene furnish an appropriate example of a sequence motif, GAAAATT. Its improbably high frequency and significant periodicity are both absent from the coding sequences of the same gene and from the noncoding sequences of a differently controlled gene in the same organism, the preproinsulin gene. This distribution of a sequence motif is in keeping with the concepts outlined. Low specificity of sequences that bind protein is likely to be compatible with highly specific conformational changes.
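The kind of motif survey described above can be illustrated with a minimal Python sketch. It is not the method used in the paper: the function names are hypothetical, the mismatch tolerance is only an assumed stand-in for a motif that is "variable within certain limits", and the expected count assumes equal base frequencies, which real genes do not have.

```python
MOTIF = "GAAAATT"  # motif named in the abstract


def find_motif(seq, motif=MOTIF, max_mismatches=1):
    """Return 0-based start positions where seq matches motif
    with at most max_mismatches substitutions (assumed tolerance)."""
    seq = seq.upper()
    k = len(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def spacings(positions):
    """Distances between consecutive hits; a recurring distance
    would hint at periodicity."""
    return [b - a for a, b in zip(positions, positions[1:])]


def expected_exact_count(seq_len, motif_len=len(MOTIF)):
    """Expected number of exact matches in a random sequence,
    assuming all four bases are equally likely (a simplification)."""
    return (seq_len - motif_len + 1) * 0.25 ** motif_len


# Toy sequence made up for illustration, not real ovalbumin data.
toy = "GAAAATT" + "CCC" + "GAAAGTT" + "CCC" + "GAAAATT"
hits = find_motif(toy)
print(hits, spacings(hits))
```

Comparing the observed number of hits in a noncoding region with `expected_exact_count` for a region of the same length gives a rough sense of whether the motif is over-represented; a proper test would use the region's actual base composition.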