Department of Bioengineering and Bioinformatics, Moscow State University, Moscow, 119992, GSP-2 Russia.
RNA. 2012 Jan;18(1):1-15. doi: 10.1261/rna.029249.111. Epub 2011 Nov 29.
Pre-mRNA structure impacts many cellular processes, including splicing in genes associated with disease. The contemporary paradigm of RNA structure prediction is biased toward secondary structures that occur within short ranges of pre-mRNA, although long-range base-pairings are known to be at least as important. Recently, we developed an efficient method for detecting conserved RNA structures on the genome-wide scale, one that does not require multiple sequence alignments and works equally well for the detection of local and long-range base-pairings. Using an enhanced method that detects base-pairings at all possible combinations of splice sites within each gene, we now report RNA structures that could be involved in the regulation of splicing in mammals. Statistically, we demonstrate strong association between the occurrence of conserved RNA structures and alternative splicing, where local RNA structures are generally more frequent at alternative donor splice sites, while long-range structures are more associated with weak alternative acceptor splice sites. As an example, we validated the RNA structure in the human SF1 gene using minigenes in the HEK293 cell line. Point mutations that disrupted the base-pairing of two complementary boxes between exons 9 and 10 of this gene altered the splicing pattern, while the compensatory mutations that reestablished the base-pairing reverted splicing to that of the wild-type. There is statistical evidence for a Dscam-like class of mammalian genes, in which mutually exclusive RNA structures control mutually exclusive alternative splicing. In sum, we propose that long-range base-pairings carry an important, yet unconsidered part of the splicing code, and that, even by modest estimates, there must be thousands of such potentially regulatory structures conserved throughout the evolutionary history of mammals.
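The logic of the SF1 validation (a disrupting point mutation followed by a compensatory mutation that restores base-pairing) can be illustrated with a minimal sketch. The sequences below are invented for illustration and are not the actual SF1 boxes; the helper names (`reverse_complement`, `paired_positions`, `mutate`) are hypothetical, and only Watson-Crick pairs are counted (G-U wobble pairs are ignored), which is a simplifying assumption rather than the method used in the study.

```python
# Toy illustration of disrupting and compensatory mutations in two
# complementary RNA boxes. Sequences are made up, not taken from SF1.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def paired_positions(box1: str, box2: str) -> int:
    """Count Watson-Crick pairs when box1 and box2 (both written 5'->3')
    are aligned antiparallel, as they would be in a stem."""
    if len(box1) != len(box2):
        raise ValueError("boxes must have equal length in this toy model")
    return sum(COMPLEMENT[a] == b for a, b in zip(box1, reversed(box2)))


def mutate(seq: str, pos: int, base: str) -> str:
    """Return seq with the nucleotide at 0-based position pos replaced."""
    return seq[:pos] + base + seq[pos + 1:]


# Wild type: box2 is the exact reverse complement of box1.
box1_wt = "GGCAUCG"
box2_wt = reverse_complement(box1_wt)

# Disrupting mutation: change one nucleotide in box1 only.
pos = 3
box1_mut = mutate(box1_wt, pos, "C")

# Compensatory mutation: change the partner nucleotide in box2 so that
# the pair is re-established (the partner sits at len - 1 - pos).
partner = len(box2_wt) - 1 - pos
box2_comp = mutate(box2_wt, partner, COMPLEMENT[box1_mut[pos]])

wild_type_pairs = paired_positions(box1_wt, box2_wt)
disrupted_pairs = paired_positions(box1_mut, box2_wt)
restored_pairs = paired_positions(box1_mut, box2_comp)
```

In this toy model the wild-type and double-mutant boxes pair at every position, while the single mutant loses one pair; that is the pattern the minigene experiment tests at the level of splicing outcome.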