Naora H, Miyahara K, Curnow R N
Proc Natl Acad Sci U S A. 1987 Sep;84(17):6195-9. doi: 10.1073/pnas.84.17.6195.
The total amount of noncoding sequence on the chromosomes of contemporary organisms varies significantly from species to species. We propose a hypothesis for the origin of these noncoding sequences that assumes that (i) the primordial gene consisted of a reading frame approximately 0.55 kilobases (kb) long and (ii) a 20-kb single-stranded polynucleotide was the longest molecule (serving as a genome) that was polymerized at random, without a specific template, in the primordial soup/cell. The statistical distribution of stop codons makes it possible to examine the probability that a reading frame of approximately 0.55 kb arises in this primordial polynucleotide. This analysis reveals that, with three stop codons, a run of non-stop codons at least 0.55 kb long would occur in 4.6% of 20-kb polynucleotide molecules. We then attempt to estimate the total amount of noncoding sequence that would be present on the chromosomes of contemporary species, assuming that present-day chromosomes retain the prototype primordial genome structure. With only a few exceptions, the theoretical estimates obtained for most eukaryotes do not differ significantly from the values reported for those organisms. Furthermore, analysis of possible stop-codon distributions suggests that life on Earth would not exist, at least in its present form, had two or four stop codons been selected early in evolution.
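The run-length probability quoted above can be illustrated with a short calculation. The sketch below is not necessarily the authors' method. It assumes a single reading frame, 64 equally likely codons, and lengths rounded down to whole codons (183 codons for 0.55 kb, 6666 codons for 20 kb); the function name `prob_open_run` is hypothetical. Under these assumptions the three-stop-codon case comes out in the vicinity of the 4.6% figure, while two stop codons make long open runs far more common and four make them far rarer.

```python
def prob_open_run(n_codons, run_len, n_stops, n_codon_types=64):
    """Probability that a random codon sequence of length n_codons contains
    at least one run of run_len consecutive non-stop codons.

    Assumption (not from the abstract): codons are independent and uniformly
    distributed, and only one reading frame is considered.
    """
    p = 1.0 - n_stops / n_codon_types   # chance a codon is not a stop
    q = 1.0 - p                         # chance a codon is a stop
    # state[k]: probability that no qualifying run has occurred yet and the
    # current trailing run of non-stop codons has length k (0 <= k < run_len).
    state = [0.0] * run_len
    state[0] = 1.0
    hit = 0.0                           # probability a qualifying run has occurred
    for _ in range(n_codons):
        alive = sum(state)
        hit += state[run_len - 1] * p   # trailing run reaches run_len
        nxt = [0.0] * run_len
        nxt[0] = alive * q              # a stop codon resets the run
        for k in range(1, run_len):
            nxt[k] = state[k - 1] * p   # a non-stop codon extends the run
        state = nxt
    return hit


if __name__ == "__main__":
    n_codons = 20000 // 3   # about 20 kb, rounded down to whole codons
    run_len = 550 // 3      # about 0.55 kb, rounded down to whole codons
    for n_stops in (2, 3, 4):
        prob = prob_open_run(n_codons, run_len, n_stops)
        print(f"{n_stops} stop codons: P(run >= {run_len} codons) = {prob:.4f}")
```

The dynamic program tracks only the length of the current open run, so it needs memory proportional to the target run length and time proportional to the product of sequence length and run length.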