Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA.
Nucleic Acids Res. 2012 Oct;40(18):9244-54. doi: 10.1093/nar/gks652. Epub 2012 Jul 11.
Human internal exons have an average size of 147 nt, and most are <300 nt. This small size is thought to facilitate exon definition. A small number of large internal exons have been identified and shown to be alternatively spliced. We identified 1115 internal exons >1000 nt in the human genome; these were found in 5% of all protein-coding genes, and most were expressed and translated. Surprisingly, 40% of these were expressed at levels similar to the flanking exons, suggesting they were constitutively spliced. While all of the large exons had strong splice sites, the constitutively spliced large exons had a higher ratio of splicing enhancers/silencers and were more conserved across mammals than the alternatively spliced large exons. We asked if large exons contain specific sequences that promote splicing and identified 38 sequences enriched in the large exons relative to small exons. The consensus sequence is C-rich with a central invariant CA dinucleotide. Mutation of these sequences in a candidate large exon indicated that these are important for recognition of large exons by the splicing machinery. We propose that these sequences are large exon splicing enhancers (LESEs).
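The size-based selection described above (internal exons >1000 nt) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `large_internal_exons`, the `(start, end)` tuple representation, and the 0-based half-open coordinate convention are all assumptions made for illustration. An exon counts as internal here if it is neither the first nor the last exon of a transcript in genomic order.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) in 0-based, half-open coordinates,
# so exon length is simply end - start.
Exon = Tuple[int, int]


def large_internal_exons(exons: List[Exon], min_len: int = 1000) -> List[Exon]:
    """Return the internal exons of one transcript that are longer than min_len nt."""
    # Sort by genomic position; the first and last exons are terminal on either strand.
    ordered = sorted(exons)
    # Drop the two terminal exons; transcripts with fewer than three exons have none left.
    internal = ordered[1:-1]
    # Keep exons strictly longer than the threshold (>1000 nt by default).
    return [(start, end) for start, end in internal if end - start > min_len]


if __name__ == "__main__":
    # Toy transcript with four exons; only the third is both internal and >1000 nt.
    transcript = [(0, 200), (500, 647), (1000, 2500), (3000, 3100)]
    print(large_internal_exons(transcript))
```

Applied to a full annotation, the same filter would be run per transcript. The downstream steps named in the abstract (expression comparison with flanking exons, enhancer/silencer ratios, sequence enrichment) are not covered by this sketch.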