Accurate prediction of multi-label protein subcellular localization through multi-view feature learning with RBRL classifier.

Affiliations

College of Mathematics and Physics, Qingdao University of Science and Technology, China.

School of Mathematics and Statistics, Central South University, China.

Publication information

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab012.

DOI:10.1093/bib/bbab012
PMID:33537726
Abstract

Multi-label proteins can participate in carrier transportation, enzyme catalysis, hormone regulation and other life activities. They also play a key role in biopharmaceuticals and in gene and cell therapy. This article proposes a method called Mps-mvRBRL to predict the subcellular localization (SCL) of multi-label proteins. Firstly, pseudo position-specific scoring matrix, dipeptide composition, position specific scoring matrix-transition probability composition, gene ontology and pseudo amino acid composition algorithms are used to obtain numerical information from different views. Based on the contribution of the five individual feature extraction methods, differential evolution is used for the first time to learn the weight of each single feature, and the original features are then fused by weighted combination into a multi-view representation. Secondly, a weighted linear discriminant analysis framework based on a binary weight form is applied to the fused high-dimensional features to eliminate irrelevant information. Finally, the best feature vector is input into the joint ranking support vector machine and binary relevance with robust low-rank learning (RBRL) classifier to predict the SCL. Under leave-one-out cross-validation, the overall actual accuracy (OAA) and overall location accuracy (OLA) of Mps-mvRBRL on the training set of Gram-positive bacteria are both 99.81%. The OAA on the test sets of the plant, virus and Gram-negative bacteria datasets are 97.24%, 98.55% and 98.20%, respectively, and the OLA are 97.16%, 97.62% and 98.28%, respectively. The results show that the model achieves good performance in predicting the SCL of multi-label proteins.
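
The abstract reports OAA and OLA without defining them. The sketch below assumes the definitions commonly used in the multi-label SCL literature: OAA is the fraction of proteins whose predicted location set exactly matches the true set, and OLA is the number of correctly predicted locations divided by the total number of true locations. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def overall_actual_accuracy(true_sets: Sequence[Set[str]],
                            pred_sets: Sequence[Set[str]]) -> float:
    """OAA (assumed definition): share of proteins with an exact set match."""
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("inputs must be non-empty and of equal length")
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if set(t) == set(p))
    return exact / len(true_sets)


def overall_location_accuracy(true_sets: Sequence[Set[str]],
                              pred_sets: Sequence[Set[str]]) -> float:
    """OLA (assumed definition): correctly predicted locations / true locations."""
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(len(set(t) & set(p)) for t, p in zip(true_sets, pred_sets))
    total = sum(len(set(t)) for t in true_sets)
    if total == 0:
        raise ValueError("at least one true location is required")
    return hits / total


# Toy example (made-up labels): the second protein has two true locations,
# but only one of them is predicted.
example_true = [{"cytoplasm"}, {"membrane", "cell wall"}, {"extracellular"}]
example_pred = [{"cytoplasm"}, {"membrane"}, {"extracellular"}]

oaa = overall_actual_accuracy(example_true, example_pred)    # 2 of 3 exact matches
ola = overall_location_accuracy(example_true, example_pred)  # 3 of 4 locations hit
```

Under these definitions OAA is the stricter measure, because a partially correct prediction counts as a miss for OAA but still contributes its correct locations to OLA.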

