Population Health Program, QIMR Berghofer Medical Research Institute, Herston, QLD 4006, Australia.
Faculty of Medicine, The University of Queensland, Herston, QLD 4006, Australia.
Bioinformatics. 2023 Apr 3;39(4). doi: 10.1093/bioinformatics/btad179.
SpliceAI is a widely used splicing prediction tool, and its most common application relies on the maximum delta score to assign variant impact on splicing. We developed the SpliceAI-10k calculator (SAI-10k-calc) to extend the use of this tool to predict: the type of splicing aberration, including pseudoexonization, intron retention, partial exon deletion, and (multi)exon skipping, using a 10 kb analysis window; the size of the inserted or deleted sequence; the effect on the reading frame; and the altered amino acid sequence. SAI-10k-calc has 95% sensitivity and 96% specificity for predicting variants that impact splicing, as computed from a control dataset of 1212 single-nucleotide variants (SNVs) with curated splicing assay results. Notably, it has high performance (≥84% accuracy) for predicting pseudoexon and partial intron retention. The automated amino acid sequence prediction allows efficient identification of variants that are expected to result in nonsense-mediated decay of the mRNA or translation of truncated proteins.
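The link between the size of the inserted or deleted sequence and the reading frame can be illustrated with a minimal Python sketch. It is not taken from SAI-10k-calc; the function name `frame_effect` and its sign convention (positive for inserted nucleotides, negative for deleted ones) are assumptions made for illustration only.

```python
def frame_effect(size_change_nt: int) -> str:
    """Classify the reading-frame effect of a net change in transcript length.

    size_change_nt: hypothetical input, the number of nucleotides inserted
    (positive) or deleted (negative) by the predicted splicing aberration.
    """
    if size_change_nt == 0:
        return "no change"
    # A change that is a multiple of 3 keeps the downstream codons in register.
    if size_change_nt % 3 == 0:
        return "in-frame"
    return "frameshift"


if __name__ == "__main__":
    for size in (-6, 47, 120):
        print(size, frame_effect(size))
```

An in-frame change can still introduce a premature stop codon within the inserted sequence, which is why a prediction of the altered amino acid sequence is needed in addition to the frame check.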
SAI-10k-calc is implemented in R (https://github.com/adavi4/SAI-10k-calc) and is also available as a Microsoft Excel spreadsheet. Users can adjust the default thresholds to suit their target performance values.
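To show how a score threshold trades sensitivity against specificity, the following Python sketch classifies variants by comparing a maximum delta score with a cutoff and scores the calls against assay results. It is an illustration under stated assumptions, not the SAI-10k-calc implementation: the function name, the toy scores and labels, and the threshold values 0.2 and 0.1 are all hypothetical and are not the tool's defaults or data.

```python
def sensitivity_specificity(scores, impacts_splicing, threshold):
    """Return (sensitivity, specificity) for calls made at a given threshold.

    scores: maximum delta score per variant (hypothetical input).
    impacts_splicing: True if the splicing assay showed an impact.
    threshold: a variant is called splice-altering if score >= threshold.
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, impacts_splicing):
        predicted = score >= threshold
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data for illustration only; not drawn from the control dataset.
toy_scores = [0.05, 0.15, 0.30, 0.60, 0.90, 0.10]
toy_truth = [False, True, True, True, True, False]

if __name__ == "__main__":
    for cutoff in (0.2, 0.1):
        print(cutoff, sensitivity_specificity(toy_scores, toy_truth, cutoff))
```

On this toy data, lowering the cutoff from 0.2 to 0.1 recovers the missed true positive but misclassifies one negative, which is the kind of trade-off a user weighs when adjusting thresholds toward a target performance.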