Predicting the multi-label protein subcellular localization through multi-information fusion and MLSI dimensionality reduction based on MLFE classifier.

Affiliations

College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China.

Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao 266061, China.

Publication information

Bioinformatics. 2022 Feb 7;38(5):1223-1230. doi: 10.1093/bioinformatics/btab811.

DOI: 10.1093/bioinformatics/btab811
PMID: 34864897
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8690230/
Abstract

MOTIVATION

Multi-label (ML) protein subcellular localization (SCL) is an indispensable way to study protein function. It can locate a certain protein (such as the human transmembrane protein that promotes the invasion of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)) or expression product at a specific location in a cell, which can provide a reference for clinical treatment of diseases such as coronavirus disease 2019 (COVID-19).

RESULTS

The article proposes a novel method named ML-locMLFE. First, six feature extraction methods are used to obtain effective protein information: pseudo amino acid composition, encoding based on grouped weight, gene ontology, multi-scale continuous and discontinuous, residue probing transformation and evolutionary distance transformation. Next, the ML information latent semantic index (MLSI) method is used to avoid the interference of redundant information. Finally, ML learning with feature-induced labeling information enrichment (MLFE) is adopted to predict the ML protein SCL. The Gram-positive bacteria dataset is chosen as the training set, while the Gram-negative bacteria, virus, newPlant and SARS-CoV-2 datasets serve as the test sets. Under leave-one-out cross-validation, the overall actual accuracies on the first four datasets are 99.23%, 93.82%, 93.24% and 96.72%. On the SARS-CoV-2 dataset, the predictor reaches an overall actual accuracy of 72.73%. The results indicate that the ML-locMLFE method has clear advantages in predicting the SCL of ML proteins and provides new ideas for further research on the topic.
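The abstract reports "overall actual accuracy" under leave-one-out cross-validation without defining either term. The sketch below illustrates one common reading, assuming that overall actual accuracy is the exact-match rate used in the multi-label localization literature: a protein counts as correct only when its predicted set of locations equals the true set, with no missing and no extra labels. The nearest-neighbor predictor, the function names and the toy data are hypothetical stand-ins for illustration; they are not the MLFE classifier or the authors' datasets.

```python
from typing import List, Sequence, Set, Tuple

# One sample = (feature vector, set of subcellular location labels).
Sample = Tuple[Sequence[float], Set[str]]


def overall_actual_accuracy(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true set exactly."""
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("need two non-empty lists of equal length")
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact / len(true_sets)


def nearest_neighbor_labels(train: List[Sample], query: Sequence[float]) -> Set[str]:
    """Stand-in predictor (NOT MLFE): copy the label set of the closest training sample."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, labels = min(train, key=lambda s: sq_dist(s[0], query))
    return set(labels)


def leave_one_out_oaa(data: List[Sample], predict=nearest_neighbor_labels) -> float:
    """Hold out each sample in turn, predict it from the rest, then score exact matches."""
    true_sets, pred_sets = [], []
    for i, (features, labels) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every sample except the held-out one
        true_sets.append(set(labels))
        pred_sets.append(predict(train, features))
    return overall_actual_accuracy(true_sets, pred_sets)


# Toy data (made up): two single-location proteins, two dual-location proteins,
# and one single-location protein that sits closest to a dual-location one.
toy = [
    ([0.0, 0.0], {"cytoplasm"}),
    ([0.1, 0.0], {"cytoplasm"}),
    ([1.0, 1.0], {"cell membrane", "extracellular"}),
    ([1.1, 1.0], {"cell membrane", "extracellular"}),
    ([0.5, 0.6], {"cell membrane"}),
]

print(leave_one_out_oaa(toy))  # 0.8: the last protein gets one extra label, so it counts as wrong
```

Under this definition a partially correct prediction earns no credit, which makes the metric stricter than per-label accuracy.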

AVAILABILITY AND IMPLEMENTATION

The source codes and datasets are publicly available at https://github.com/QUST-AIBBDRC/ML-locMLFE/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Predicting the multi-label protein subcellular localization through multi-information fusion and MLSI dimensionality reduction based on MLFE classifier.
Bioinformatics. 2022 Feb 7;38(5):1223-1230. doi: 10.1093/bioinformatics/btab811.
2
Accurate prediction of multi-label protein subcellular localization through multi-view feature learning with RBRL classifier.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab012.
3
SubMito-XGBoost: predicting protein submitochondrial localization by fusing multiple feature information and eXtreme gradient boosting.
Bioinformatics. 2020 Feb 15;36(4):1074-1081. doi: 10.1093/bioinformatics/btz734.
4
ML-FGAT: Identification of multi-label protein subcellular localization by interpretable graph attention networks and feature-generative adversarial networks.
Comput Biol Med. 2024 Mar;170:107944. doi: 10.1016/j.compbiomed.2024.107944. Epub 2024 Jan 2.
5
Transfer learning via multi-scale convolutional neural layers for human-virus protein-protein interaction prediction.
Bioinformatics. 2021 Dec 11;37(24):4771-4778. doi: 10.1093/bioinformatics/btab533.
6
Predicting protein-protein interactions by fusing various Chou's pseudo components and using wavelet denoising approach.
J Theor Biol. 2019 Feb 7;462:329-346. doi: 10.1016/j.jtbi.2018.11.011. Epub 2018 Nov 16.
7
pLoc_bal-mVirus: Predict Subcellular Localization of Multi-Label Virus Proteins by Chou's General PseAAC and IHTS Treatment to Balance Training Dataset.
Med Chem. 2019;15(5):496-509. doi: 10.2174/1573406415666181217114710.
8
mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines.
BMC Bioinformatics. 2012 Nov 6;13:290. doi: 10.1186/1471-2105-13-290.
9
SLPred: a multi-view subcellular localization prediction tool for multi-location human proteins.
Bioinformatics. 2022 Sep 2;38(17):4226-4229. doi: 10.1093/bioinformatics/btac458.
10
MSlocPRED: deep transfer learning-based identification of multi-label mRNA subcellular localization.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae504.

Cited by

1
Deep learning model for protein multi-label subcellular localization and function prediction based on multi-task collaborative training.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae568.
2
A Review for Artificial Intelligence Based Protein Subcellular Localization.
Biomolecules. 2024 Mar 27;14(4):409. doi: 10.3390/biom14040409.
3
Recent Advances in the Prediction of Subcellular Localization of Proteins and Related Topics.
Front Bioinform. 2022 May 19;2:910531. doi: 10.3389/fbinf.2022.910531. eCollection 2022.