Abebrese Emmanuel L, Ali Syed H, Arnold Zachary R, Andrews Victoria M, Armstrong Katharine, Burns Lindsay, Crowder Hannah R, Day R Thomas, Hsu Daniel G, Jarrell Katherine, Lee Grace, Luo Yi, Mugayo Daphine, Raza Zain, Friend Kyle
Department of Chemistry and Biochemistry, Washington and Lee University, Lexington, Virginia, United States of America.
PLoS One. 2017 May 17;12(5):e0175393. doi: 10.1371/journal.pone.0175393. eCollection 2017.
Canonical pre-mRNA splicing requires snRNPs and associated splicing factors to excise conserved intronic sequences, with a minimum intron length required for efficient splicing. Non-canonical splicing (intron excision without the spliceosome) has been documented; most notably, some tRNAs and the XBP1 mRNA contain short introns that are not removed by the spliceosome. There have been some efforts to identify additional short introns, but little is known about how many short introns are processed from mRNAs. Here, we report an approach to identify short RNA introns from RNA-Seq data while discriminating against small genomic deletions. We identify hundreds of short introns conserved among multiple human cell lines. These short introns are often alternatively spliced and are found in a variety of RNAs, both mRNAs and lncRNAs. Short intron splicing efficiency is increased by secondary structure, and we detect both canonical and non-canonical short introns. In many cases, splicing of these short introns from mRNAs is predicted to alter the reading frame and change protein output. Our findings imply that standard gene prediction models, which often assume a lower limit for intron size, fail to predict short introns effectively. We conclude that short introns are abundant in the human transcriptome, and short intron splicing represents an added layer of mRNA regulation.
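The abstract's point that short intron splicing can alter the reading frame follows from simple arithmetic: removing a stretch whose length is not a multiple of three shifts every downstream codon. The sketch below illustrates only that arithmetic; the function names, the toy sequence, and the example lengths are hypothetical and are not taken from the authors' pipeline.

```python
def splicing_shifts_frame(intron_length: int) -> bool:
    """Return True if excising an intron of this length from a coding
    region would shift the downstream reading frame.

    Illustrative assumption: the intron lies entirely within the
    coding sequence, so only its length modulo 3 matters.
    """
    if intron_length <= 0:
        raise ValueError("intron length must be positive")
    return intron_length % 3 != 0


def splice_out(sequence: str, start: int, end: int) -> str:
    """Remove sequence[start:end] (0-based, end-exclusive) and join the flanks."""
    if not 0 <= start < end <= len(sequence):
        raise ValueError("invalid intron coordinates")
    return sequence[:start] + sequence[end:]


if __name__ == "__main__":
    # Hypothetical short intron lengths, chosen only to show both outcomes.
    for length in (26, 27):
        print(length, splicing_shifts_frame(length))

    # Toy sequence: removing one whole codon keeps the frame intact.
    print(splice_out("ATGAAACCCGGG", 3, 6))
```

A real analysis would also need to account for the position of the intron relative to the annotated coding sequence and for any premature stop codons introduced downstream, neither of which this sketch attempts.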