Arquès D G, Michel C J
Friedrich Miescher Institut, Bioinformatic group, Basel, Switzerland.
Nucleic Acids Res. 1987 Sep 25;15(18):7581-92. doi: 10.1093/nar/15.18.7581.
The sequence information for intron splicing is found in the consensus sequences at the two splice sites. For long introns of 300 or more nucleotides, the middle regions may provide additional specificity for splicing, which can be investigated by defining an adequate quantitative parameter. This methodology retrieves the coding periodicity in viral and mitochondrial introns and identifies, with statistical significance, a surprising alternating purine-pyrimidine base sequence (i.e., a modulo 2 periodicity) in eukaryotic introns, particularly in vertebrate introns. This alternating structure suggests that vertebrate introns do not carry the genetic information to code for proteins but instead serve structural and regulatory functions.
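The abstract does not define its quantitative parameter, so the following is only a minimal sketch of one way a modulo 2 purine-pyrimidine periodicity could be measured. The function names (`to_ry`, `match_fraction`, `modulo2_score`), the `max_lag` parameter, and the score itself are hypothetical illustrations, not the authors' method.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CTU")


def to_ry(sequence):
    """Map a nucleotide string to 'R' (purine) and 'Y' (pyrimidine) symbols."""
    out = []
    for base in sequence.upper():
        if base in PURINES:
            out.append("R")
        elif base in PYRIMIDINES:
            out.append("Y")
        else:
            # Assumption: ambiguous symbols are rejected rather than guessed.
            raise ValueError(f"unsupported nucleotide symbol: {base!r}")
    return "".join(out)


def match_fraction(ry, lag):
    """Fraction of positions i where ry[i] and ry[i + lag] share a class."""
    pairs = len(ry) - lag
    if pairs <= 0:
        raise ValueError("sequence too short for this lag")
    matches = sum(1 for i in range(pairs) if ry[i] == ry[i + lag])
    return matches / pairs


def modulo2_score(sequence, max_lag=10):
    """Hypothetical score: mean match fraction at even lags minus odd lags.

    A strictly alternating R/Y sequence matches itself at every even lag
    and never at an odd lag, so it scores 1.0; a homopolymer scores 0.0.
    """
    ry = to_ry(sequence)
    even = [match_fraction(ry, k) for k in range(2, max_lag + 1, 2)]
    odd = [match_fraction(ry, k) for k in range(1, max_lag + 1, 2)]
    return sum(even) / len(even) - sum(odd) / len(odd)
```

For example, `modulo2_score("ACGT" * 10)` returns `1.0` because the sequence alternates purine and pyrimidine at every position, while `modulo2_score("A" * 40)` returns `0.0`. Judging whether a score on a real intron is statistically significant would need a reference distribution (for instance, scores of shuffled sequences), which this sketch does not include.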