Bioinformatics Interdepartmental Program, University of California, Los Angeles, California 90095, USA.
Department of Bioengineering, University of California, Los Angeles, California 90095, USA.
Genome Res. 2021 Mar;31(3):359-371. doi: 10.1101/gr.265637.120. Epub 2021 Jan 15.
Alternative splicing is an RNA processing mechanism that affects most genes in human, contributing to disease mechanisms and phenotypic diversity. The regulation of splicing involves an intricate network of cis-regulatory elements and trans-acting factors. Due to their high sequence specificity, cis-regulation of splicing can be altered by genetic variants, significantly affecting splicing outcomes. Recently, multiple methods have been applied to understanding the regulatory effects of genetic variants on splicing. However, it is still challenging to go beyond apparent association to pinpoint functional variants. To fill in this gap, we utilized large-scale data sets of the Genotype-Tissue Expression (GTEx) project to study genetically modulated alternative splicing (GMAS) via identification of allele-specific splicing events. We demonstrate that GMAS events are shared across tissues and individuals more often than expected by chance, consistent with their genetically driven nature. Moreover, although the allelic bias of GMAS exons varies across samples, the degree of variation is similar across tissues versus individuals. Thus, genetic background drives the GMAS pattern to a similar degree as tissue-specific splicing mechanisms. Leveraging the genetically driven nature of GMAS, we developed a new method to predict functional splicing-altering variants, built upon a genotype-phenotype concordance model across samples. Complemented by experimental validations, this method predicted >1000 functional variants, many of which may alter RNA-protein interactions. Lastly, 72% of GMAS-associated SNPs were in linkage disequilibrium with GWAS-reported SNPs, and such association was enriched in tissues of relevance for specific traits/diseases. Our study enables a comprehensive view of genetically driven splicing variations in human tissues.
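The abstract does not spell out how allelic bias is measured. As a rough illustration only, and not the authors' pipeline, allelic bias at a heterozygous SNP inside an exon could be checked by comparing the RNA-seq read counts carrying each allele against an equal-expression null. The function names, the exact binomial test, and the 0.5 null proportion below are all assumptions made for this sketch.

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of SNP-covering reads that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover the SNP")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value under the null of no allelic bias.

    Assumption for this sketch: under the null, each read carries the
    reference allele with probability 0.5, so outcome i has probability
    C(n, i) / 2**n. The p-value sums the probabilities of all outcomes
    that are no more likely than the observed one.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover the SNP")
    observed = comb(n, ref_reads)
    # Integer arithmetic keeps the "no more likely than observed" test exact.
    tail = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return tail / 2**n


if __name__ == "__main__":
    # Hypothetical read counts for one heterozygous SNP in one sample.
    ref, alt = 30, 10
    print(f"allelic ratio = {allelic_ratio(ref, alt):.2f}")
    print(f"two-sided p   = {binomial_two_sided_p(ref, alt):.4g}")
```

A real analysis would also need to handle mapping bias toward the reference allele, multiple-testing correction across SNPs, and the cross-sample concordance modeling that the abstract mentions; none of that is attempted here.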