pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC.

Author information

Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China.

Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China; The Gordon Life Science Institute, Boston, MA 02478, USA.

Publication information

Genomics. 2018 Jan;110(1):50-58. doi: 10.1016/j.ygeno.2017.08.005. Epub 2017 Aug 14.

DOI:10.1016/j.ygeno.2017.08.005

PMID:28818512

Abstract

Many efforts have been made to predict the subcellular localization of eukaryotic proteins, but most existing methods have two limitations: (1) they cover fewer than ten locations, so many organelles in a eukaryotic cell cannot be covered, and (2) they can only deal with single-label systems, in which each constituent protein has one and only one location. Proteins with multiple locations are particularly interesting, however, since they may have exceptional functions that are important both for in-depth understanding of the biological processes in a cell and for selecting drug targets. Although several predictors (such as "Euk-mPLoc", "Euk-PLoc 2.0" and "iLoc-Euk") can cover up to 22 different location sites and can also treat multi-label proteins, further efforts are needed to improve their prediction quality, particularly by raising the absolute true rate and lowering the absolute false rate. Here we propose a new predictor called "pLoc-mEuk", built by extracting the key GO (Gene Ontology) information into the general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validations on a high-quality and stringent benchmark dataset indicate that the proposed pLoc-mEuk predictor is remarkably superior to iLoc-Euk, the best of the aforementioned three predictors. For the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mEuk/, by which users can easily get their desired results without needing to go through the complicated mathematics involved.
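The abstract names the absolute true rate and the absolute false rate without defining them. As a minimal sketch, assuming the definitions commonly used for multi-label predictors (absolute true: the fraction of proteins whose predicted location set exactly matches the observed set; absolute false: the mean size of the symmetric difference between the two sets, divided by the total number of locations), the two rates could be computed as below. The function names and the toy data are hypothetical and are not taken from the paper or its web-server.

```python
from typing import List, Set


def absolute_true_rate(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true set exactly."""
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("inputs must be non-empty and of equal length")
    exact_hits = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact_hits / len(true_sets)


def absolute_false_rate(
    true_sets: List[Set[str]], pred_sets: List[Set[str]], num_labels: int
) -> float:
    """Mean of (|union| - |intersection|) / num_labels over all samples."""
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("inputs must be non-empty and of equal length")
    # |union| - |intersection| is the size of the symmetric difference (t ^ p).
    mismatches = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return mismatches / (len(true_sets) * num_labels)


# Toy data (assumed for illustration only): three proteins, 22 candidate locations.
observed = [{"nucleus"}, {"cytoplasm", "nucleus"}, {"mitochondrion"}]
predicted = [{"nucleus"}, {"cytoplasm"}, {"mitochondrion", "nucleus"}]

print(absolute_true_rate(observed, predicted))       # 1 of 3 sets matches exactly
print(absolute_false_rate(observed, predicted, 22))  # 2 mismatched labels / (3 * 22)
```

Under these assumed definitions a better predictor has a higher absolute true rate and a lower absolute false rate, which is the direction of improvement the abstract refers to.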

Similar articles

pLoc-mEuk: Predict subcellular localization of multi-label eukaryotic proteins by extracting the key GO information into general PseAAC.

Genomics. 2018 Jan;110(1):50-58. doi: 10.1016/j.ygeno.2017.08.005. Epub 2017 Aug 14.

pLoc_bal-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by General PseAAC and Quasi-balancing Training Dataset.

Med Chem. 2019;15(5):472-485. doi: 10.2174/1573406415666181218102517.

pLoc-mVirus: Predict subcellular localization of multi-location virus proteins via incorporating the optimal GO information into general PseAAC.

Gene. 2017 Sep 10;628:315-321. doi: 10.1016/j.gene.2017.07.036. Epub 2017 Jul 18.

pLoc-mHum: predict subcellular localization of multi-location human proteins via general PseAAC to winnow out the crucial GO information.

Bioinformatics. 2018 May 1;34(9):1448-1456. doi: 10.1093/bioinformatics/btx711.

pLoc_bal-mVirus: Predict Subcellular Localization of Multi-Label Virus Proteins by Chou's General PseAAC and IHTS Treatment to Balance Training Dataset.

Med Chem. 2019;15(5):496-509. doi: 10.2174/1573406415666181217114710.

pLoc_bal-mGpos: Predict subcellular localization of Gram-positive bacterial proteins by quasi-balancing training dataset and PseAAC.

Genomics. 2019 Jul;111(4):886-892. doi: 10.1016/j.ygeno.2018.05.017. Epub 2018 May 26.

pLoc-mPlant: predict subcellular localization of multi-location plant proteins by incorporating the optimal GO information into general PseAAC.

Mol Biosyst. 2017 Aug 22;13(9):1722-1727. doi: 10.1039/c7mb00267j.

pLoc-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by deep gene ontology learning via general PseAAC.

Genomics. 2017 Oct 6. doi: 10.1016/j.ygeno.2017.10.002.

pLoc_bal-mGneg: Predict subcellular localization of Gram-negative bacterial proteins by quasi-balancing training dataset and general PseAAC.

J Theor Biol. 2018 Dec 7;458:92-102. doi: 10.1016/j.jtbi.2018.09.005. Epub 2018 Sep 8.

pLoc_bal-mPlant: Predict Subcellular Localization of Plant Proteins by General PseAAC and Balancing Training Dataset.

Curr Pharm Des. 2018;24(34):4013-4022. doi: 10.2174/1381612824666181119145030.

Cited by

iDLB-Pred: identification of disordered lipid binding residues in protein sequences using convolutional neural network.

Sci Rep. 2024 Oct 21;14(1):24724. doi: 10.1038/s41598-024-75700-x.

RMTLysPTM: recognizing multiple types of lysine PTM sites by deep analysis on sequences.

Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad450.

Hemolytic-Pred: A machine learning-based predictor for hemolytic proteins using position and composition-based features.

Digit Health. 2023 Jul 5;9:20552076231180739. doi: 10.1177/20552076231180739. eCollection 2023 Jan-Dec.

Identification of Potential Proteinaceous Ligands of GI.1 Norovirus in Pacific Oyster Tissues.

Viruses. 2023 Feb 25;15(3):631. doi: 10.3390/v15030631.

Computational prediction of disordered binding regions.

Comput Struct Biotechnol J. 2023 Feb 10;21:1487-1497. doi: 10.1016/j.csbj.2023.02.018. eCollection 2023.

A proteome-wide systems toxicological approach deciphers the interaction network of chemotherapeutic drugs in the cardiovascular milieu.

RSC Adv. 2018 Jun 4;8(36):20211-20221. doi: 10.1039/c8ra02877j. eCollection 2018 May 30.

Genome-Wide Identification and Expression Analysis of SNARE Genes in ..

Plants (Basel). 2022 Mar 7;11(5):711. doi: 10.3390/plants11050711.

IDDLncLoc: Subcellular Localization of LncRNAs Based on a Framework for Imbalanced Data Distributions.

Interdiscip Sci. 2022 Jun;14(2):409-420. doi: 10.1007/s12539-021-00497-6. Epub 2022 Feb 22.

ProtPlat: an efficient pre-training platform for protein classification based on FastText.

BMC Bioinformatics. 2022 Feb 11;23(1):66. doi: 10.1186/s12859-022-04604-2.

Multiple Protein Subcellular Locations Prediction Based on Deep Convolutional Neural Networks with Self-Attention Mechanism.

Interdiscip Sci. 2022 Jun;14(2):421-438. doi: 10.1007/s12539-021-00496-7. Epub 2022 Jan 23.
