Mercer Tim R, Clark Michael B, Andersen Stacey B, Brunck Marion E, Haerty Wilfried, Crawford Joanna, Taft Ryan J, Nielsen Lars K, Dinger Marcel E, Mattick John S
Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; St. Vincent's Clinical School, Faculty of Medicine, UNSW Australia, Sydney, New South Wales 2052, Australia;
Garvan Institute of Medical Research, Sydney, New South Wales 2010, Australia; MRC Functional Genomics Unit, Department of Physiology, Anatomy, and Genetics, University of Oxford, Oxford OX1 3PT, United Kingdom;
Genome Res. 2015 Feb;25(2):290-303. doi: 10.1101/gr.182899.114. Epub 2015 Jan 5.
During the splicing reaction, the 5' intron end is joined to the branchpoint nucleotide, selecting the next exon to incorporate into the mature RNA and forming an intron lariat, which is excised. Despite their critical role in gene splicing, the locations and features of human splicing branchpoints are largely unknown. We use exoribonuclease digestion and targeted RNA-sequencing to enrich for sequences that traverse the lariat junction and, by split and inverted alignment, reveal the branchpoint. We identify 59,359 high-confidence human branchpoints in more than 10,000 genes, providing a first map of splicing branchpoints in the human genome. Branchpoints are predominantly adenosine, highly conserved, and located close to the 3' splice site. Analysis of human branchpoints reveals numerous novel features, including properties specific to the branchpoints of alternatively spliced exons and a family of conserved sequence motifs overlapping branchpoints that we term B-boxes, which exhibit maximal nucleotide diversity while maintaining interactions with the keto-rich U2 snRNA. Different B-box motifs exhibit divergent usage in vertebrate lineages and associate with other splicing elements and distinct intron-exon architectures, suggesting integration within a broader regulatory splicing code. Lastly, although branchpoints are refractory to common mutational processes and genetic variation, mutations occurring at branchpoint nucleotides are enriched for disease associations.
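The split and inverted alignment mentioned above can be illustrated with a toy sketch. It is not the authors' pipeline: it assumes exact matches, a single known intron, and a read given in the sense orientation, whereas real lariat reads need a proper aligner and tolerance for mismatches around the junction. The function name `find_branchpoint`, the `min_anchor` parameter, and the example sequences are hypothetical and chosen only for illustration. The idea is that a read crossing the lariat junction consists of sequence ending at the branchpoint followed directly by sequence from the 5' end of the intron, so its two halves map to the genome in inverted order.

```python
from typing import Optional


def find_branchpoint(intron: str, read: str, min_anchor: int = 8) -> Optional[int]:
    """Return the 0-based branchpoint offset within `intron`, or None.

    Toy model (an assumption, not the published method): a lariat-junction
    read is the sequence ending at the branchpoint, followed immediately by
    the sequence at the 5' end of the intron. Only exact matches are used.
    """
    # Try every split of the read that leaves at least `min_anchor` bases per side.
    for k in range(min_anchor, len(read) - min_anchor + 1):
        head, tail = read[:k], read[k:]
        # The second half must match the very start of the intron (5' splice site end).
        if not intron.startswith(tail):
            continue
        # The first half must map downstream of the second half: the inverted order.
        pos = intron.find(head, len(tail))
        if pos != -1:
            # The last base of the first half is the branchpoint nucleotide.
            return pos + len(head) - 1
    return None


# Hypothetical intron: `upstream` ends with the branchpoint adenosine.
upstream = "GTAAGTATCC" + "GGATTCGAACCTGG" + "TACTGA"
intron = upstream + "CTCTTCTTTTCCTTCAG"

# Simulated junction read: 12 bases ending at the branchpoint, then 10 bases
# from the 5' end of the intron.
read = upstream[-12:] + intron[:10]

branchpoint = find_branchpoint(intron, read)
if branchpoint is not None:
    print(branchpoint, intron[branchpoint])
```

A read that lies entirely within the linear intron has no half that matches the intron start, so the function returns `None` for it. This is why only junction-spanning reads are informative about the branchpoint.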