CI-SpliceAI-Improving machine learning predictions of disease causing splicing variants using curated alternative splice sites.

Affiliations

School of Human Development and Health, Faculty of Medicine, University of Southampton, Hampshire, United Kingdom.

Vision, Learning and Control, Department of Electronics and Computer Science, Faculty of Engineering and Physical Sciences, University of Southampton, Hampshire, United Kingdom.

Publication information

PLoS One. 2022 Jun 3;17(6):e0269159. doi: 10.1371/journal.pone.0269159. eCollection 2022.

DOI:10.1371/journal.pone.0269159
PMID:35657932
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9165884/
Abstract

BACKGROUND

It is estimated that up to 50% of all disease-causing variants disrupt splicing. Because splicing is complex, our ability to predict which variants disrupt it is limited, meaning missed diagnoses for patients. The emergence of machine learning for targeted medicine holds great potential to improve prediction of splice-disrupting variants. The recently published SpliceAI algorithm utilises deep neural networks and has been reported to have a greater accuracy than other commonly used methods.

METHODS AND FINDINGS

The original SpliceAI was trained on splice sites included in primary isoforms combined with novel junctions observed in GTEx data, which might introduce noise and de-correlate the machine learning input from its output. Limiting the training data to only validated and manually annotated primary and alternatively spliced GENCODE sites may improve predictive abilities. All of these gene isoforms were collapsed (aggregated into one pseudo-isoform) and the SpliceAI architecture was retrained (CI-SpliceAI). Predictive performance on a newly curated dataset of 1,316 functionally validated variants from the literature was compared with the original SpliceAI, alongside MMSplice, MaxEntScan, and SQUIRLS. Both SpliceAI algorithms outperformed the other methods, with the original SpliceAI achieving an accuracy of ∼91%, and CI-SpliceAI showing an improvement at ∼92% overall. Predictive accuracy increased in the majority of curated variants.
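
The collapsing step described above (aggregating all annotated isoforms of a gene into one pseudo-isoform) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Transcript` class, the function names, the plus-strand-only handling, and the convention that a donor is the last exonic base and an acceptor the first exonic base are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record for one annotated plus-strand isoform.

    Exons are 1-based inclusive (start, end) coordinate pairs.
    """
    transcript_id: str
    exons: Tuple[Tuple[int, int], ...]


def splice_sites(tx: Transcript) -> Tuple[Set[int], Set[int]]:
    """Return (donors, acceptors) for one isoform.

    Assumed convention: a donor is the last base of an exon that is
    followed by an intron; an acceptor is the first base of an exon
    that is preceded by an intron.
    """
    exons = sorted(tx.exons)
    donors = {end for _, end in exons[:-1]}
    acceptors = {start for start, _ in exons[1:]}
    return donors, acceptors


def collapse_isoforms(transcripts: Iterable[Transcript]) -> Dict[str, List[int]]:
    """Union the splice sites of all isoforms into one pseudo-isoform."""
    donors: Set[int] = set()
    acceptors: Set[int] = set()
    for tx in transcripts:
        d, a = splice_sites(tx)
        donors |= d
        acceptors |= a
    return {"donors": sorted(donors), "acceptors": sorted(acceptors)}


if __name__ == "__main__":
    # Toy gene: a primary isoform plus an alternative isoform that uses
    # a different donor at the first exon and skips the middle exon.
    gene = [
        Transcript("primary", ((100, 200), (300, 400), (500, 600))),
        Transcript("alternative", ((100, 220), (500, 600))),
    ]
    print(collapse_isoforms(gene))
    # {'donors': [200, 220, 400], 'acceptors': [300, 500]}
```

In this toy gene the pseudo-isoform carries both the primary donor at 200 and the alternative donor at 220, so a model trained on it would see every annotated site as a positive label rather than only those of the primary isoform.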

CONCLUSIONS

We show that including only manually annotated alternatively spliced sites in training data improves prediction of clinically relevant variants, and highlight avenues for further performance improvements.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9a/9165884/688f1ae676a6/pone.0269159.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9a/9165884/653cb7793ed7/pone.0269159.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9a/9165884/0565557a44c9/pone.0269159.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9a/9165884/aa4790696c09/pone.0269159.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9a/9165884/bbc7c71b6a5d/pone.0269159.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9a/9165884/a375f107c76e/pone.0269159.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9a/9165884/0b252e671c96/pone.0269159.g007.jpg

Similar articles

1
CI-SpliceAI-Improving machine learning predictions of disease causing splicing variants using curated alternative splice sites.
PLoS One. 2022 Jun 3;17(6):e0269159. doi: 10.1371/journal.pone.0269159. eCollection 2022.
2
Benchmarking deep learning splice prediction tools using functional splice assays.
Hum Mutat. 2021 Jul;42(7):799-810. doi: 10.1002/humu.24212. Epub 2021 May 20.
3
Performance evaluation of computational methods for splice-disrupting variants and improving the performance using the machine learning-based framework.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac334.
4
Combining full-length gene assay and SpliceAI to interpret the splicing impact of all possible SPINK1 coding variants.
Hum Genomics. 2024 Feb 27;18(1):21. doi: 10.1186/s40246-024-00586-9.
5
A validated heart-specific model for splice-disrupting variants in childhood heart disease.
Genome Med. 2024 Oct 15;16(1):119. doi: 10.1186/s13073-024-01383-8.
6
Performance Evaluation of SpliceAI for the Prediction of Splicing of Variants.
Genes (Basel). 2021 Aug 25;12(9):1308. doi: 10.3390/genes12091308.
7
Comparison of Tools for Splice-Altering Variant Prediction Using Established Spliceogenic Variants: An End-User's Point of View.
Int J Genomics. 2022 Oct 13;2022:5265686. doi: 10.1155/2022/5265686. eCollection 2022.
8
SPiP: Splicing Prediction Pipeline, a machine learning tool for massive detection of exonic and intronic variant effects on mRNA splicing.
Hum Mutat. 2022 Dec;43(12):2308-2323. doi: 10.1002/humu.24491. Epub 2022 Nov 20.
9
Splam: a deep-learning-based splice site predictor that improves spliced alignments.
Genome Biol. 2024 Sep 16;25(1):243. doi: 10.1186/s13059-024-03379-4.
10
Benchmarking splice variant prediction algorithms using massively parallel splicing assays.
Genome Biol. 2023 Dec 21;24(1):294. doi: 10.1186/s13059-023-03144-z.

Cited by

1
Clinical utility of DNA-methylation signatures in routine diagnostics for neurodevelopmental disorders.
Eur J Hum Genet. 2025 Jul 29. doi: 10.1038/s41431-025-01919-5.
2
Integrating Artificial Intelligence in Next-Generation Sequencing: Advances, Challenges, and Future Directions.
Curr Issues Mol Biol. 2025 Jun 19;47(6):470. doi: 10.3390/cimb47060470.
3
Research advancements in the use of artificial intelligence for prenatal diagnosis of neural tube defects.
Front Pediatr. 2025 Apr 17;13:1514447. doi: 10.3389/fped.2025.1514447. eCollection 2025.
4
Role of artificial intelligence in advancing immunology.
Immunol Res. 2025 Apr 24;73(1):76. doi: 10.1007/s12026-025-09632-7.
5
Detection of mRNA Transcript Variants.
Genes (Basel). 2025 Mar 16;16(3):343. doi: 10.3390/genes16030343.
6
Toward a comprehensive profiling of alternative splicing proteoform structures, interactions and functions.
Curr Opin Struct Biol. 2025 Feb;90:102979. doi: 10.1016/j.sbi.2024.102979. Epub 2025 Jan 7.
7
Transformers significantly improve splice site prediction.
Commun Biol. 2024 Dec 4;7(1):1616. doi: 10.1038/s42003-024-07298-9.
8
Homozygous variants in WDR83OS lead to a neurodevelopmental disorder with hypercholanemia.
Am J Hum Genet. 2024 Nov 7;111(11):2566-2581. doi: 10.1016/j.ajhg.2024.10.002. Epub 2024 Oct 28.
9
Germline Variants in Patients Affected by Both Uveal and Cutaneous Melanoma.
Pigment Cell Melanoma Res. 2025 Jan;38(1):e13199. doi: 10.1111/pcmr.13199. Epub 2024 Sep 24.
10
Functional characterization of 2,832 JAG1 variants supports reclassification for Alagille syndrome and improves guidance for clinical variant interpretation.
Am J Hum Genet. 2024 Aug 8;111(8):1656-1672. doi: 10.1016/j.ajhg.2024.06.011. Epub 2024 Jul 22.

References

1
Comparison of in silico strategies to prioritize rare genomic variants impacting RNA splicing for the diagnosis of genomic disorders.
Sci Rep. 2021 Oct 18;11(1):20607. doi: 10.1038/s41598-021-99747-2.
2
Interpretable prioritization of splice variants in diagnostic next-generation sequencing.
Am J Hum Genet. 2021 Sep 2;108(9):1564-1577. doi: 10.1016/j.ajhg.2021.06.014. Epub 2021 Jul 21.
3
Spectrum of splicing variants in disease genes and the ability of RNA analysis to reduce uncertainty in clinical interpretation.
Am J Hum Genet. 2021 Apr 1;108(4):696-708. doi: 10.1016/j.ajhg.2021.03.006. Epub 2021 Mar 19.
4
GENCODE 2021.
Nucleic Acids Res. 2021 Jan 8;49(D1):D916-D923. doi: 10.1093/nar/gkaa1087.
5
Analysis of transcript-deleterious variants in Mendelian disorders: implications for RNA-based diagnostics.
Genome Biol. 2020 Jun 17;21(1):145. doi: 10.1186/s13059-020-02053-9.
6
Blood RNA analysis can increase clinical diagnostic rate and resolve variants of uncertain significance.
Genet Med. 2020 Jun;22(6):1005-1014. doi: 10.1038/s41436-020-0766-9. Epub 2020 Mar 3.
7
Machine Learning Approaches for the Prioritization of Genomic Variants Impacting Pre-mRNA Splicing.
Cells. 2019 Nov 26;8(12):1513. doi: 10.3390/cells8121513.
8
Diagnostic utility of transcriptome sequencing for rare Mendelian diseases.
Genet Med. 2020 Mar;22(3):490-499. doi: 10.1038/s41436-019-0672-1. Epub 2019 Oct 14.
9
Computational analysis of functional SNPs in Alzheimer's disease-associated endocytosis genes.
PeerJ. 2019 Sep 30;7:e7667. doi: 10.7717/peerj.7667. eCollection 2019.
10
The Kipoi repository accelerates community exchange and reuse of predictive models for genomics.
Nat Biotechnol. 2019 Jun;37(6):592-600. doi: 10.1038/s41587-019-0140-0.