Identification of Protein Subcellular Localization With Network and Functional Embeddings.

Authors

Pan Xiaoyong, Li Hao, Zeng Tao, Li Zhandong, Chen Lei, Huang Tao, Cai Yu-Dong

Affiliations

School of Life Sciences, Shanghai University, Shanghai, China.

Key Laboratory of System Control and Information Processing, Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Ministry of Education of China, Shanghai, China.

Publication information

Front Genet. 2021 Jan 20;11:626500. doi: 10.3389/fgene.2020.626500. eCollection 2020.

DOI: 10.3389/fgene.2020.626500
PMID: 33584818
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7873866/
Abstract

The functions of proteins are mainly determined by their subcellular localizations in cells. Currently, many computational methods for predicting the subcellular localization of proteins have been proposed. However, these methods require further improvement, especially when used in protein representations. In this study, we present an embedding-based method for predicting the subcellular localization of proteins. We first learn the functional embeddings of KEGG/GO terms, which are further used in representing proteins. Then, we characterize the network embeddings of proteins on a protein-protein network. The functional and network embeddings are combined as novel representations of protein locations for the construction of the final classification model. In our collected benchmark dataset with 4,861 proteins from 16 locations, the best model shows a Matthews correlation coefficient of 0.872 and is thus superior to multiple conventional methods.
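The two quantitative ideas in the abstract, joining the functional and network embeddings into one representation and scoring predictions with a Matthews correlation coefficient, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy vectors, and the location labels are hypothetical, and the abstract does not say which multi-class form of the MCC the authors used. The version below is the common multi-class generalization computed from class counts.

```python
from collections import Counter
from math import sqrt


def concat_embeddings(functional, network):
    """Join one protein's functional and network vectors into a single feature vector.

    Simple concatenation is an assumption; the abstract only says the two
    embeddings are "combined".
    """
    return list(functional) + list(network)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient computed from class counts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    # Number of samples whose predicted location matches the true one.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)   # how often each class truly occurs
    pred_counts = Counter(y_pred)   # how often each class is predicted
    labels = set(true_counts) | set(pred_counts)
    cross = sum(pred_counts[k] * true_counts[k] for k in labels)
    numerator = correct * n - cross
    denominator = sqrt(
        (n * n - sum(v * v for v in pred_counts.values()))
        * (n * n - sum(v * v for v in true_counts.values()))
    )
    # By convention, return 0 when the denominator vanishes
    # (for example, when every sample is assigned to one class).
    return numerator / denominator if denominator else 0.0


# Toy example with hypothetical labels and vectors (not data from the paper).
features = concat_embeddings([0.1, 0.2], [0.3])
truth = ["nucleus", "nucleus", "cytoplasm", "membrane"]
guess = ["nucleus", "cytoplasm", "cytoplasm", "membrane"]
score = multiclass_mcc(truth, guess)
```

A perfect prediction gives a score of 1, and the single misassignment in the toy example lowers it to 0.7. The reported 0.872 is read on the same scale.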

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce62/7873866/89840e6f5db7/fgene-11-626500-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce62/7873866/4e7cb9e011f8/fgene-11-626500-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce62/7873866/8b84f04a75ba/fgene-11-626500-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce62/7873866/867a6c3c3bf3/fgene-11-626500-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ce62/7873866/8269a8fa2dac/fgene-11-626500-g0005.jpg

Similar articles

1
Identification of Protein Subcellular Localization With Network and Functional Embeddings.
Front Genet. 2021 Jan 20;11:626500. doi: 10.3389/fgene.2020.626500. eCollection 2020.
2
Identifying Protein Subcellular Locations With Embeddings-Based node2loc.
IEEE/ACM Trans Comput Biol Bioinform. 2022 Mar-Apr;19(2):666-675. doi: 10.1109/TCBB.2021.3080386. Epub 2022 Apr 1.
3
GO2Vec: transforming GO terms and proteins to vector representations via graph embeddings.
BMC Genomics. 2019 Dec 24;20(Suppl 9):918. doi: 10.1186/s12864-019-6272-2.
4
Predicting Human Protein Subcellular Locations by Using a Combination of Network and Function Features.
Front Genet. 2021 Nov 5;12:783128. doi: 10.3389/fgene.2021.783128. eCollection 2021.
5
Predicting protein subcellular location with network embedding and enrichment features.
Biochim Biophys Acta Proteins Proteom. 2020 Oct;1868(10):140477. doi: 10.1016/j.bbapap.2020.140477. Epub 2020 Jun 25.
6
lncLocator 2.0: a cell-line-specific subcellular localization predictor for long non-coding RNAs with interpretable deep learning.
Bioinformatics. 2021 Aug 25;37(16):2308-2316. doi: 10.1093/bioinformatics/btab127.
7
Hum-mPLoc 3.0: prediction enhancement of human protein subcellular localization through modeling the hidden correlations of gene ontology and functional domain features.
Bioinformatics. 2017 Mar 15;33(6):843-853. doi: 10.1093/bioinformatics/btw723.
8
Graph embeddings on gene ontology annotations for protein-protein interaction prediction.
BMC Bioinformatics. 2020 Dec 16;21(Suppl 16):560. doi: 10.1186/s12859-020-03816-8.
9
Sparse regressions for predicting and interpreting subcellular localization of multi-label proteins.
BMC Bioinformatics. 2016 Feb 24;17:97. doi: 10.1186/s12859-016-0940-x.
10
MetaLocGramN: A meta-predictor of protein subcellular localization for Gram-negative bacteria.
Biochim Biophys Acta. 2012 Dec;1824(12):1425-33. doi: 10.1016/j.bbapap.2012.05.018. Epub 2012 Jun 15.

Cited by

1
Medium-sized protein language models perform well at transfer learning on realistic datasets.
Sci Rep. 2025 Jul 1;15(1):21400. doi: 10.1038/s41598-025-05674-x.
2
Scaling down for efficiency: Medium-sized protein language models perform well at transfer learning on realistic datasets.
bioRxiv. 2025 Jan 28:2024.11.22.624936. doi: 10.1101/2024.11.22.624936.
3
Navigating the human-monkeypox virus interactome: HuPoxNET atlas reveals functional insights.
Front Microbiol. 2024 Aug 2;15:1399555. doi: 10.3389/fmicb.2024.1399555. eCollection 2024.
4
Natural products can be potential inhibitors of metalloproteinase II from to intervene colorectal cancer.
Heliyon. 2024 Jun 13;10(12):e32838. doi: 10.1016/j.heliyon.2024.e32838. eCollection 2024 Jun 30.
5
Subtractive genomics study of Xanthomonas oryzae pv. Oryzae reveals repurposable drug candidate for the treatment of bacterial leaf blight in rice.
J Genet Eng Biotechnol. 2024 Mar;22(1):100353. doi: 10.1016/j.jgeb.2024.100353. Epub 2024 Jan 23.
6
Recent Advancements in Subcellular Proteomics: Growing Impact of Organellar Protein Niches on the Understanding of Cell Biology.
J Proteome Res. 2024 Aug 2;23(8):2700-2722. doi: 10.1021/acs.jproteome.3c00839. Epub 2024 Mar 7.
7
UniKP: a unified framework for the prediction of enzyme kinetic parameters.
Nat Commun. 2023 Dec 11;14(1):8211. doi: 10.1038/s41467-023-44113-1.
8
Unannotated Open Reading Frame in Encodes Protein Localizing to the Endoplasmic Reticulum.
MicroPubl Biol. 2023 Oct 20;2023. doi: 10.17912/micropub.biology.000992. eCollection 2023.
9
A review from biological mapping to computation-based subcellular localization.
Mol Ther Nucleic Acids. 2023 Apr 20;32:507-521. doi: 10.1016/j.omtn.2023.04.015. eCollection 2023 Jun 13.
10
Survey of Protein Sequence Embedding Models.
Int J Mol Sci. 2023 Feb 14;24(4):3775. doi: 10.3390/ijms24043775.

References

1
Identifying Robust Microbiota Signatures and Interpretable Rules to Distinguish Cancer Subtypes.
Front Mol Biosci. 2020 Nov 4;7:604794. doi: 10.3389/fmolb.2020.604794. eCollection 2020.
2
Investigation and Prediction of Human Interactome Based on Quantitative Features.
Front Bioeng Biotechnol. 2020 Jul 17;8:730. doi: 10.3389/fbioe.2020.00730. eCollection 2020.
3
Alternative Polyadenylation Modification Patterns Reveal Essential Posttranscription Regulatory Mechanisms of Tumorigenesis in Multiple Tumor Types.
Biomed Res Int. 2020 Jun 15;2020:6384120. doi: 10.1155/2020/6384120. eCollection 2020.
4
Predicting protein subcellular location with network embedding and enrichment features.
Biochim Biophys Acta Proteins Proteom. 2020 Oct;1868(10):140477. doi: 10.1016/j.bbapap.2020.140477. Epub 2020 Jun 25.
5
Discriminating Origin Tissues of Tumor Cell Lines by Methylation Signatures and Dys-Methylated Rules.
Front Bioeng Biotechnol. 2020 May 26;8:507. doi: 10.3389/fbioe.2020.00507. eCollection 2020.
6
Prediction of Drug Side Effects with a Refined Negative Sample Selection Strategy.
Comput Math Methods Med. 2020 May 9;2020:1573543. doi: 10.1155/2020/1573543. eCollection 2020.
7
iATC-FRAKEL: a simple multi-label web server for recognizing anatomical therapeutic chemical classes of drugs with their fingerprints only.
Bioinformatics. 2020 Jun 1;36(11):3568-3569. doi: 10.1093/bioinformatics/btaa166.
8
Copy Number Variation Pattern for Discriminating MACROD2 States of Colorectal Cancer Subtypes.
Front Bioeng Biotechnol. 2019 Dec 19;7:407. doi: 10.3389/fbioe.2019.00407. eCollection 2019.
9
HIV infection alters the human epigenetic landscape.
Gene Ther. 2019 Feb;26(1-2):29-39. doi: 10.1038/s41434-018-0051-6. Epub 2018 Nov 15.
10
Identification of synthetic lethality based on a functional network by using machine learning algorithms.
J Cell Biochem. 2019 Jan;120(1):405-416. doi: 10.1002/jcb.27395. Epub 2018 Aug 20.