
DeepLoc 2.0: multi-label subcellular localization prediction using protein language models.

Affiliations

Indian Institute of Technology Madras, Chennai 600036, India.

Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen 2200, Denmark.

Publication information

Nucleic Acids Res. 2022 Jul 5;50(W1):W228-W234. doi: 10.1093/nar/gkac278.

DOI:10.1093/nar/gkac278
PMID:35489069
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9252801/
Abstract

The prediction of protein subcellular localization is of great relevance for proteomics research. Here, we propose an update to the popular tool DeepLoc with multi-localization prediction and improvements in both performance and interpretability. For training and validation, we curate eukaryotic and human multi-location protein datasets with stringent homology partitioning and enriched with sorting signal information compiled from the literature. We achieve state-of-the-art performance in DeepLoc 2.0 by using a pre-trained protein language model. It has the further advantage that it uses sequence input rather than relying on slower protein profiles. We provide two means of better interpretability: an attention output along the sequence and highly accurate prediction of nine different types of protein sorting signals. We find that the attention output correlates well with the position of sorting signals. The webserver is available at services.healthtech.dtu.dk/service.php?DeepLoc-2.0.
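
The abstract describes two ideas that a small example can make concrete: multi-label prediction (a protein may be assigned to several locations at once) and an attention output along the sequence. The following is a minimal sketch of that general pattern, not the DeepLoc 2.0 implementation. Everything in it is an assumption made for illustration: the four-label set, the random arrays standing in for per-residue embeddings from a pre-trained protein language model, the untrained random weights, and the 0.5 decision threshold.

```python
import numpy as np

# Hypothetical label set for illustration; not the label set used by DeepLoc 2.0.
LABELS = ["Nucleus", "Cytoplasm", "Mitochondrion", "Cell membrane"]


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def predict_locations(embeddings, attn_vector, weights, bias, threshold=0.5):
    """Attention-pool per-residue embeddings, then score each label independently.

    embeddings:  (L, D) array, one D-dimensional vector per residue
    attn_vector: (D,) array used to score each residue
    weights:     (D, K) array, one column per location label
    bias:        (K,) array
    """
    attention = softmax(embeddings @ attn_vector)  # (L,), one weight per residue
    pooled = attention @ embeddings                # (D,), weighted average of residues
    probs = sigmoid(pooled @ weights + bias)       # (K,), independent per-label scores
    predicted = [label for label, p in zip(LABELS, probs) if p >= threshold]
    return attention, probs, predicted


# Random stand-ins for language-model embeddings and trained parameters (assumption).
rng = np.random.default_rng(0)
seq_len, dim = 120, 16
embeddings = rng.normal(size=(seq_len, dim))
attn_vector = rng.normal(size=dim)
weights = rng.normal(size=(dim, len(LABELS)))
bias = np.zeros(len(LABELS))

attention, probs, predicted = predict_locations(embeddings, attn_vector, weights, bias)
print("most attended residue index:", int(np.argmax(attention)))
print("label probabilities:", dict(zip(LABELS, probs.round(3))))
print("predicted locations:", predicted)
```

Because each label gets its own sigmoid rather than sharing one softmax, any number of locations (including none) can pass the threshold, which is what makes the output multi-label. The `attention` vector has one weight per residue, so it can be plotted along the sequence and compared with known sorting-signal positions, in the spirit of the interpretability output the abstract mentions.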


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35f2/9252801/ece70965d9aa/gkac278fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35f2/9252801/793890fdbe23/gkac278figgra1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35f2/9252801/25d4d35b5896/gkac278fig1.jpg
