Zuckerkandl Emile
Institute of Molecular Medical Sciences, Stanford, CA 94309, USA.
Genetica. 2002 May;115(1):105-29. doi: 10.1023/a:1016080316076.
It is recalled that dispensability of sequences and neutral substitution rate must not be construed to be markers of nonfunctionality. Different aspects of functionality relate to differently-sized nucleotide communities. At the time cells became nucleated, a boom of epigenetic processes led to uses of DNA that required many more nucleotides operating collectively than do functions definable in terms of classical genetics. Each order of magnitude of nucleotide plurality was colonized by functions germane to that order. The eukaryote genome became a great epigenetic machine. Sequences of different levels of nucleotide plurality are briefly discussed from the point of view of their functional relevance. By their activities as both transcribed genes and cis-acting repeats, SINEs and LINEs are the principal link between genetic and epigenetic processes. SINEs can act as local repeats to produce position effect variegation (PEV) in a nearby gene. PEV may thus represent a general method of overall transcriptional regulation at the level of cell collectivities. When tracking the scale dependence of nucleotide function, one finds the 100 kb order of nucleotide plurality to provide epigenetically the basis at once for PEV, imprinting, and cell determination, with sectorial repressibility a trait common to the three. In sectorial repressibility, introns may play a structural role favoring the stability of higher-order chromatin structures. At that level of nucleotide involvement, nonconserved nonhomologous nonprotein-coding sequences may often play the same structural roles. In addition, genomic distance per se--and, therefore, the mass of intervening nucleotides--can have functional effects. Distances between enhancers and promoters need to be probed in this respect. At the 1,000 kb level of nucleotide function, attention is focused on the formation of centromeres. 
It is one of the levels of nucleotide plurality per function where specificity in the generation of DNA/protein complexes seems to depend more upon the structural fit among factors than upon the DNA sequence. This circumstance may explain in part the prevailing difficulty in recognizing the functional nature of sequences among non-protein-coding nucleotide arrays and the propensity among investigators to tag the majority of DNA sequences in higher organisms as functionally meaningless. Noncoding DNA often may not be 'selected' as an appropriate niche for a certain function, but be 'elected' in that capacity by a group of factors, as a preexisting sequence that is only now called upon to serve. Much of the non-protein-coding DNA may thus be only conditionally functional and in fact may never be elected to functions at a high level of nucleotide plurality. Eukaryotes are composites, at different levels of this plurality, of the functional and the nonfunctional, as well as of the conditionally functional and the outright functional. Thus, a sequence that is nonfunctional at one level of nucleotide plurality may participate in a functional sequence at a more inclusive level. In the end, every nucleotide is at least infinitesimally functional if, for metabolic and developmental reasons, the chromatin mass as such becomes a selectable entity. Given the scale dependence of nucleotide function, large amounts of 'junk DNA', contrary to common belief, must be assumed to contribute to the complexity of gene interaction systems and of organisms.