Jang Woori, Park Joonhong, Chae Hyojin, Kim Myungshin
Department of Laboratory Medicine, College of Medicine, Inha University, Incheon, Republic of Korea.
Department of Laboratory Medicine, Jeonbuk National University Medical School and Hospital, Jeonju, Republic of Korea.
Int J Genomics. 2022 Oct 13;2022:5265686. doi: 10.1155/2022/5265686. eCollection 2022.
Assessing the impact of variants of unknown significance on splicing has become a critical issue and a bottleneck, especially with the widespread implementation of whole-genome or exome sequencing. Although multiple tools are available, interpreting and applying their output is difficult, and practical guidelines are still lacking. A streamlined decision-making process can make downstream RNA analysis more efficient. Therefore, we evaluated the performance of 8 tools (Splice Site Finder, MaxEntScan, Splice-site prediction by neural network, GeneSplicer, Human Splicing Finder, SpliceAI, Splicing Predictions in Consensus Elements, and SpliceRover) using 114 spliceogenic variants, experimentally validated at the mRNA level. The change that each variant caused in the predicted score of the nearest wild-type splice site was analyzed, and for type II, III, and IV splice variants, the change in the prediction score of the de novo or cryptic splice site was also analyzed. SpliceAI and SpliceRover, tools based on deep learning, outperformed all other tools, with AUCs of 0.972 and 0.924, respectively. For de novo and cryptic splice sites, SpliceAI outperformed all other tools and showed a sensitivity of 95.7% at an optimal cut-off of 0.02 score change. Our results show that deep learning algorithms, especially those of SpliceAI, are validated at a significantly higher rate than other tools for clinically relevant variants. This suggests that deep learning algorithms outperform traditional probabilistic approaches and classical machine learning tools in predicting de novo and cryptic splice sites.
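The score-change evaluation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the use of an absolute difference, the "greater than or equal to" comparison against the cut-off, and all example scores are assumptions made for illustration. Only the 0.02 cut-off comes from the abstract. The AUC helper needs a set of non-spliceogenic control scores, which the abstract does not describe, so the controls here are purely hypothetical.

```python
from typing import Sequence

# Score-change cut-off quoted in the abstract for SpliceAI.
CUTOFF = 0.02


def score_change(wild_type: float, variant: float) -> float:
    """Absolute change in a tool's splice-site score (assumed definition)."""
    return abs(variant - wild_type)


def sensitivity(changes: Sequence[float], cutoff: float = CUTOFF) -> float:
    """Fraction of known spliceogenic variants whose change reaches the cut-off."""
    if not changes:
        raise ValueError("need at least one variant")
    hits = sum(1 for c in changes if c >= cutoff)
    return hits / len(changes)


def auc(positives: Sequence[float], negatives: Sequence[float]) -> float:
    """Rank-based AUC: chance that a random positive outscores a random
    negative, with ties counted as half a win."""
    if not positives or not negatives:
        raise ValueError("need both positive and negative scores")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical score changes for validated spliceogenic variants.
    spliceogenic = [0.5, 0.01, 0.3, 0.9]
    # Hypothetical score changes for control variants.
    controls = [0.0, 0.005, 0.4]
    print("sensitivity:", sensitivity(spliceogenic))
    print("AUC:", auc(spliceogenic, controls))
```

With real data, `spliceogenic` would hold one score change per validated variant for a given tool, and the same two functions could be applied to each tool in turn to compare them.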