Spliceator: multi-species splice site prediction using convolutional neural networks.

Affiliations

Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000, Strasbourg, France.

BiGEst-ICube Platform, ICube Laboratory, UMR7357, 1 rue Eugène Boeckel, 67000, Strasbourg, France.

Publication information

BMC Bioinformatics. 2021 Nov 23;22(1):561. doi: 10.1186/s12859-021-04471-3.

DOI: 10.1186/s12859-021-04471-3
PMID: 34814826
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8609763/

Abstract

BACKGROUND

Ab initio prediction of splice sites is an essential step in eukaryotic genome annotation. Recent predictors have exploited Deep Learning algorithms and reliable gene structures from model organisms. However, Deep Learning methods for non-model organisms are lacking.

RESULTS

We developed Spliceator to predict splice sites in a wide range of species, including model and non-model organisms. Spliceator uses a convolutional neural network and is trained on carefully validated data from over 100 organisms. We show that Spliceator achieves consistently high accuracy (89-92%) compared to existing methods on independent benchmarks from human, fish, fly, worm, plant and protist organisms.

CONCLUSIONS

Spliceator is a new Deep Learning method trained on high-quality data, which can be used to predict splice sites in diverse organisms, ranging from human to protists, with consistently high accuracy.
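
The abstract does not describe how sequences are presented to the network. As a rough illustration of the kind of input that convolutional splice site predictors commonly work with, the sketch below extracts fixed-size windows around candidate donor (GT) and acceptor (AG) dinucleotides and one-hot encodes them. The function names, the flank size and the toy sequence are assumptions made for this example and are not taken from the Spliceator paper.

```python
# Illustrative sketch only: names, flank size and the toy sequence are
# assumptions, not details of the Spliceator implementation.
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of four 0/1 values (A, C, G, T).

    Bases outside the alphabet (for example N) become all-zero rows.
    """
    return [[1 if base == letter else 0 for letter in ALPHABET]
            for base in seq.upper()]


def candidate_windows(seq, motif, flank):
    """Yield (position, window) for each `motif` occurrence that has
    `flank` bases available on both sides."""
    seq = seq.upper()
    start = seq.find(motif, flank)
    while start != -1 and start + len(motif) + flank <= len(seq):
        yield start, seq[start - flank:start + len(motif) + flank]
        start = seq.find(motif, start + 1)


toy_sequence = "AAAGTAAGTCCCAGGT"
donors = list(candidate_windows(toy_sequence, "GT", flank=3))
acceptors = list(candidate_windows(toy_sequence, "AG", flank=3))
encoded_donors = [one_hot(window) for _, window in donors]
```

Each encoded window is a matrix with one row per base and four columns, which is the shape a one-dimensional convolution would scan. A trained classifier would then score every window as a true or false splice site.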

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/37aec50ce6df/12859_2021_4471_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/c7e852c7d0c5/12859_2021_4471_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/039168bc693b/12859_2021_4471_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/0d782bf98df7/12859_2021_4471_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/f32f7b8844ea/12859_2021_4471_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/122178e2249e/12859_2021_4471_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/902bb58fa18d/12859_2021_4471_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/3ab6fc602192/12859_2021_4471_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/65af612e9542/12859_2021_4471_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/59dac3a75cec/12859_2021_4471_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/7c5da801827a/12859_2021_4471_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/d7c11218e058/12859_2021_4471_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7656/8609763/6b9f457032a3/12859_2021_4471_Fig13_HTML.jpg

Similar articles

1
Spliceator: multi-species splice site prediction using convolutional neural networks.
BMC Bioinformatics. 2021 Nov 23;22(1):561. doi: 10.1186/s12859-021-04471-3.
2
SpliceFinder: ab initio prediction of splice sites using convolutional neural network.
BMC Bioinformatics. 2019 Dec 27;20(Suppl 23):652. doi: 10.1186/s12859-019-3306-3.
3
Splice2Deep: An ensemble of deep convolutional neural networks for improved splice site prediction in genomic DNA.
Gene. 2020 Dec;763S:100035. doi: 10.1016/j.gene.2020.100035. Epub 2020 May 13.
4
Splice2Deep: An ensemble of deep convolutional neural networks for improved splice site prediction in genomic DNA.
Gene X. 2020 May 13;5:100035. doi: 10.1016/j.gene.2020.100035. eCollection 2020 Dec.
5
CNNSplice: Robust models for splice site prediction using convolutional neural networks.
Comput Struct Biotechnol J. 2023 May 30;21:3210-3223. doi: 10.1016/j.csbj.2023.05.031. eCollection 2023.
6
EDeepSSP: Explainable deep neural networks for exact splice sites prediction.
J Bioinform Comput Biol. 2020 Aug;18(4):2050024. doi: 10.1142/S0219720020500249. Epub 2020 Jul 22.
7
DRANetSplicer: A Splice Site Prediction Model Based on Deep Residual Attention Networks.
Genes (Basel). 2024 Mar 26;15(4):404. doi: 10.3390/genes15040404.
8
Human Splice-Site Prediction with Deep Neural Networks.
J Comput Biol. 2018 Aug;25(8):954-961. doi: 10.1089/cmb.2018.0041. Epub 2018 Apr 18.
9
EnsembleSplice: ensemble deep learning model for splice site prediction.
BMC Bioinformatics. 2022 Oct 6;23(1):413. doi: 10.1186/s12859-022-04971-w.
10
SpliceRover: interpretable convolutional neural networks for improved splice site prediction.
Bioinformatics. 2018 Dec 15;34(24):4180-4188. doi: 10.1093/bioinformatics/bty497.

Cited by

1
Genetic Heterogeneity of Autism Spectrum Disorder: Identification of Five Novel Mutations (RIMS2, FOXG1, AUTS2, ZCCHC17, and SPTBN5) in Iranian Families via Whole-Exome and Whole-Genome Sequencing.
Biochem Genet. 2025 Aug 16. doi: 10.1007/s10528-025-11226-9.
2
Investigation of growth traits in Turkish Merino lambs using multi-locus GWAS approaches: Karacabey Merino.
BMC Vet Res. 2025 Aug 8;21(1):511. doi: 10.1186/s12917-025-04957-9.
3
Genomic Characterization and Molecular Epidemiology of Tusaviruses and Related Novel Protoparvoviruses (Family Parvoviridae) from Ruminant Species (Bovine, Ovine and Caprine) in Hungary.
Viruses. 2025 Jun 24;17(7):888. doi: 10.3390/v17070888.
4
Synthesis of large single-transcript pathways from oligonucleotide pools: Design of STARBURST, an autobioluminescent reporter.
Proc Natl Acad Sci U S A. 2025 Aug 5;122(31):e2508109122. doi: 10.1073/pnas.2508109122. Epub 2025 Jul 29.
5
RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks.
Nat Commun. 2025 Jul 1;16(1):5671. doi: 10.1038/s41467-025-60872-5.
6
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling.
Proc Mach Learn Res. 2024 Jul;235:43632-43648.
7
Predicting Protein Function in the AI and Big Data Era.
Biochemistry. 2025 Jun 3;64(11):2345-2352. doi: 10.1021/acs.biochem.5c00186. Epub 2025 May 17.
8
Genomic and Epidemiological Investigations Reveal Chromosomal Integration of the Acipenserid Herpesvirus 3 Genome in Lake Sturgeon.
Viruses. 2025 Apr 5;17(4):534. doi: 10.3390/v17040534.
9
Targeted long-read cDNA sequencing reveals novel splice-altering pathogenic variants causing retinal dystrophies.
HGG Adv. 2025 Apr 18;6(3):100442. doi: 10.1016/j.xhgg.2025.100442.
10
SProtFP: a machine learning-based method for functional classification of small ORFs in prokaryotes.
NAR Genom Bioinform. 2025 Jan 7;7(1):lqae186. doi: 10.1093/nargab/lqae186. eCollection 2025 Mar.

References

1
Genome annotation across species using deep convolutional neural networks.
PeerJ Comput Sci. 2020 Jun 15;6:e278. doi: 10.7717/peerj-cs.278. eCollection 2020.
2
BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database.
NAR Genom Bioinform. 2021 Jan 6;3(1):lqaa108. doi: 10.1093/nargab/lqaa108. eCollection 2021 Mar.
3
Helixer: cross-species gene annotation of large eukaryotic genomes using deep learning.
Bioinformatics. 2021 Apr 1;36(22-23):5291-5298. doi: 10.1093/bioinformatics/btaa1044.
4
UniProt: the universal protein knowledgebase in 2021.
Nucleic Acids Res. 2021 Jan 8;49(D1):D480-D489. doi: 10.1093/nar/gkaa1100.
5
Understanding the causes of errors in eukaryotic protein-coding gene prediction: a case study of primate proteomes.
BMC Bioinformatics. 2020 Nov 10;21(1):513. doi: 10.1186/s12859-020-03855-1.
6
A survey on deep learning in DNA/RNA motif mining.
Brief Bioinform. 2021 Jul 20;22(4). doi: 10.1093/bib/bbaa229.
7
Incomplete annotation has a disproportionate impact on our understanding of Mendelian and complex neurogenetic disorders.
Sci Adv. 2020 Jun 10;6(24). doi: 10.1126/sciadv.aay8299. Print 2020 Jun.
8
Deep learning models in genomics; are we there yet?
Comput Struct Biotechnol J. 2020 Jun 17;18:1466-1473. doi: 10.1016/j.csbj.2020.06.017. eCollection 2020.
9
Modern deep learning in bioinformatics.
J Mol Cell Biol. 2020 Oct 30;12(11):823-827. doi: 10.1093/jmcb/mjaa030.
10
A benchmark study of ab initio gene prediction methods in diverse eukaryotic organisms.
BMC Genomics. 2020 Apr 9;21(1):293. doi: 10.1186/s12864-020-6707-9.