Shibata Akihide, Okuno Tatsuya, Rahman Mohammad Alinoor, Azuma Yoshiteru, Takeda Jun-Ichi, Masuda Akio, Selcen Duygu, Engel Andrew G, Ohno Kinji
Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, Nagoya, Japan.
Department of Neurology, Mayo Clinic, Rochester, MN, USA.
J Hum Genet. 2016 Jul;61(7):633-40. doi: 10.1038/jhg.2016.23. Epub 2016 Mar 24.
Precise spatiotemporal regulation of splicing is mediated by splicing cis-elements on pre-mRNA. Single-nucleotide variations (SNVs) affecting intronic cis-elements may compromise splicing, but no efficient tool has been available to identify them. Following an effect-size analysis of each intronic nucleotide on annotated alternative splicing, we extracted 105 parameters that could affect the strength of the splicing signals. However, we could not generate reliable support vector regression models to predict the percent-splice-in (PSI) scores for normal human tissues. Next, we generated support vector machine (SVM) models using 110 parameters to directly differentiate pathogenic SNVs in the Human Gene Mutation Database from normal SNVs in the dbSNP database, and we obtained models with a sensitivity of 0.800±0.041 (mean and s.d.) and a specificity of 0.849±0.021. Our IntSplice models were more discriminating than SVM models that we generated with the Shapiro-Senapathy score and MaxEntScan::score3ss. We applied IntSplice to one naturally occurring and nine artificial intronic mutations in RAPSN, which cause congenital myasthenic syndrome. IntSplice correctly predicted the splicing consequences for nine of the ten mutants. We created a web service program, IntSplice (http://www.med.nagoya-u.ac.jp/neurogenetics/IntSplice), to predict splicing-affecting SNVs at intronic positions from -50 to -3.
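The sensitivity and specificity figures above are reported as mean and s.d. A minimal sketch of how such a summary could be computed is given here, assuming the values come from several independent evaluation runs and that the sample standard deviation is used; neither assumption is stated in the abstract. The function names and the toy labels are hypothetical and are not IntSplice data (1 = pathogenic SNV, 0 = normal SNV).

```python
from statistics import mean, stdev


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels: 1 = pathogenic SNV, 0 = normal SNV (an assumed encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # pathogenic, called pathogenic
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # pathogenic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, called normal
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal, called pathogenic
    return tp / (tp + fn), tn / (tn + fp)


def summarize(runs):
    """Summarize several evaluation runs as (mean, sample s.d.) pairs.

    `runs` is a list of (y_true, y_pred) tuples, one per run.
    Returns ((sens_mean, sens_sd), (spec_mean, spec_sd)).
    """
    scores = [sensitivity_specificity(t, p) for t, p in runs]
    sens = [s for s, _ in scores]
    spec = [s for _, s in scores]
    return (mean(sens), stdev(sens)), (mean(spec), stdev(spec))


# Hypothetical toy runs, for illustration only.
toy_runs = [
    ([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1]),
    ([1, 1, 0, 0], [1, 1, 0, 1]),
]
(sens_mean, sens_sd), (spec_mean, spec_sd) = summarize(toy_runs)
print(f"sensitivity {sens_mean:.3f}±{sens_sd:.3f}, "
      f"specificity {spec_mean:.3f}±{spec_sd:.3f}")
```

Sensitivity here is the fraction of pathogenic SNVs that the model flags, and specificity is the fraction of normal SNVs that it leaves unflagged.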