Transformers significantly improve splice site prediction.

Authors

Jónsson Benedikt A, Halldórsson Gísli H, Árdal Steinþór, Rögnvaldsson Sölvi, Einarsson Eyþór, Sulem Patrick, Guðbjartsson Daníel F, Melsted Páll, Stefánsson Kári, Úlfarsson Magnús Ö

Affiliations

deCODE Genetics/Amgen Inc., Reykjavik, Iceland.

University of Iceland, Reykjavik, Iceland.

Publication information

Commun Biol. 2024 Dec 4;7(1):1616. doi: 10.1038/s42003-024-07298-9.

DOI: 10.1038/s42003-024-07298-9
PMID: 39633146
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11618611/

Abstract

Mutations that affect RNA splicing significantly impact human diversity and disease. Here we present a method using transformers, a type of machine learning model, to detect splicing from raw 45,000-nucleotide sequences. We generate embeddings with residual neural networks and apply hard attention to select splice site candidates, enabling efficient training on long sequences. Our method surpasses the leading tool, SpliceAI, in detecting splice sites in GENCODE and ENSEMBL annotations. Using extensive RNA sequencing data from an Icelandic cohort of 17,848 individuals and the Genotype-Tissue Expression (GTEx) project, our method demonstrates superior performance in detecting splice junctions compared to SpliceAI-10k (PR-AUC = 0.834 vs. PR-AUC = 0.820) and is more effective at identifying disease-related splice variants in ClinVar (PR-AUC = 0.997 vs. PR-AUC = 0.996). These advancements hold promise for improving genetic research and clinical diagnostics, potentially leading to better understanding and treatment of splicing-related diseases.
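
The abstract says hard attention is used to select splice site candidates so that training on long sequences stays efficient, but it gives no implementation details. As a rough illustration of the general idea only, the sketch below keeps the k highest-scoring positions of a sequence and passes only their embeddings onward. The function names, the value of k, and the toy scores and embeddings are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def select_candidates(scores: Sequence[float], k: int) -> List[int]:
    """Hard selection: keep the indices of the k highest-scoring positions.

    Indices are returned in sequence order so that later layers see the
    candidates in the order they occur along the sequence.
    """
    if k <= 0:
        return []
    # Rank positions by descending score; ties go to the lower position.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


def gather(embeddings: Sequence[Sequence[float]],
           indices: Sequence[int]) -> List[List[float]]:
    """Collect the embedding vectors at the selected positions."""
    return [list(embeddings[i]) for i in indices]


# Toy input (made up): 8 positions, each with a score and a 2-d embedding.
scores = [0.01, 0.90, 0.05, 0.02, 0.75, 0.03, 0.60, 0.04]
embeddings = [[float(i), float(i) * 0.5] for i in range(len(scores))]

picked = select_candidates(scores, k=3)
subset = gather(embeddings, picked)
print(picked)  # [1, 4, 6]
print(subset)  # [[1.0, 0.5], [4.0, 2.0], [6.0, 3.0]]
```

Because only k positions survive the selection, any later step whose cost is quadratic in length (such as self-attention) runs over k items rather than the full input, which is one plausible reading of why such a selection makes long inputs tractable.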

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a14/11618611/a99070cd342c/42003_2024_7298_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a14/11618611/07dff4bf319b/42003_2024_7298_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a14/11618611/62da7c412195/42003_2024_7298_Fig3_HTML.jpg
