Quantitative Medicine and Systems Biology Division, Translational Genomics Research Institute, Phoenix, AZ, USA.
Collaborative Center for Translational Mass Spectrometry, Translational Genomics Research Institute, Phoenix, AZ, USA.
Sci Rep. 2021 May 24;11(1):10740. doi: 10.1038/s41598-021-89938-2.
The robust detection of disease-associated splice events from RNAseq data is challenging due to the potential confounding effect of gene expression levels and the often limited number of patients with relevant RNAseq data. Here we present a novel statistical approach to splicing outlier detection and differential splicing analysis. Our approach tests for differences in the percentages of sequence reads representing local splice events. We describe a software package called Bisbee, which can predict the protein-level effect of splice alterations, a key feature lacking in many other splicing analysis resources. We use Bisbee's prediction of protein-level effects to benchmark its capabilities against matched sets of RNAseq and mass spectrometry data from normal tissues. Bisbee exhibits improved sensitivity and specificity over existing approaches and can be used to identify tissue-specific splice variants whose protein-level expression can be confirmed by mass spectrometry. We also applied Bisbee to assess evidence for a pathogenic splicing variant contributing to a rare disease and to identify tumor-specific splice isoforms associated with an oncogenic mutation. In both cases Bisbee rediscovered previously validated results, and it also identified common tumor-associated splice isoforms that replicated in two independent melanoma datasets.
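To make concrete the idea of testing for differences in the percentage of reads that represent a local splice event, here is a minimal sketch. It is not Bisbee's statistical model or its API. The function names `percent_spliced` and `permutation_p_value` are hypothetical, the read counts are made-up toy values, and a simple permutation test stands in for whatever test the package actually uses.

```python
import random
from statistics import mean


def percent_spliced(isoform1_reads, isoform2_reads):
    """Fraction of reads at a local splice event that support isoform 1.

    Returns None when no reads cover the event, since the fraction is
    undefined there (hypothetical convention for this sketch).
    """
    total = isoform1_reads + isoform2_reads
    if total == 0:
        return None
    return isoform1_reads / total


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    A stand-in for a real differential splicing test; it ignores read
    depth, which a count-based model would take into account.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy (isoform 1, isoform 2) read counts per sample; not real data.
disease_counts = [(45, 5), (40, 10), (38, 7), (50, 6)]
control_counts = [(10, 40), (12, 35), (8, 42), (15, 45)]

disease = [percent_spliced(a, b) for a, b in disease_counts]
control = [percent_spliced(a, b) for a, b in control_counts]
p_value = permutation_p_value(disease, control)
print(f"mean disease = {mean(disease):.2f}, mean control = {mean(control):.2f}")
print(f"permutation p-value = {p_value:.3f}")
```

Working with the fraction of reads rather than raw counts is what makes the comparison largely independent of overall gene expression, the confounder the abstract mentions. With only four samples per group, a permutation test cannot produce a p-value much below 2/70, which illustrates why small cohorts make this kind of analysis hard.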