A knowledge base for predicting protein localization sites in eukaryotic cells.

Authors

Nakai K, Kanehisa M

Affiliation

Institute for Chemical Research, Kyoto University, Japan.

Publication information

Genomics. 1992 Dec;14(4):897-911. doi: 10.1016/s0888-7543(05)80111-9.

DOI:10.1016/s0888-7543(05)80111-9

PMID:1478671

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7134799/

Abstract

To automate examination of massive amounts of sequence data for biological function, it is important to computerize interpretation based on empirical knowledge of sequence-function relationships. For this purpose, we have been constructing a knowledge base by organizing various experimental and computational observations as a collection of if-then rules. Here we report an expert system, which utilizes this knowledge base, for predicting localization sites of proteins only from the information on the amino acid sequence and the source origin. We collected data for 401 eukaryotic proteins with known localization sites (subcellular and extracellular) and divided them into training data and testing data. Fourteen localization sites were distinguished for animal cells and 17 for plant cells. When sorting signals were not well characterized experimentally, various sequence features were computationally derived from the training data. It was found that 66% of the training data and 59% of the testing data were correctly predicted by our expert system. This artificial intelligence approach is powerful and flexible enough to be used in genome analyses.
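The idea of a knowledge base organized as a collection of if-then rules can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Rule` class, the `predict` and `accuracy` helpers, the default site, and the toy sequences are hypothetical, and the three rules use widely known sorting signals (C-terminal KDEL or HDEL for retention in the endoplasmic reticulum, C-terminal SKL for peroxisomal targeting) rather than the rule set actually used by the authors. The inputs match those named in the abstract: the amino acid sequence and the source origin.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    """One hypothetical if-then rule: if condition(sequence, origin) holds, predict site."""
    name: str
    condition: Callable[[str, str], bool]
    site: str


# Illustrative rules only; NOT the knowledge base described in the paper.
RULES: List[Rule] = [
    Rule("C-terminal KDEL",
         lambda seq, origin: seq.endswith("KDEL"),
         "endoplasmic reticulum"),
    # Shows how the source origin can gate a rule: HDEL is treated here as a yeast signal.
    Rule("C-terminal HDEL (yeast)",
         lambda seq, origin: origin == "yeast" and seq.endswith("HDEL"),
         "endoplasmic reticulum"),
    Rule("C-terminal SKL",
         lambda seq, origin: seq.endswith("SKL"),
         "peroxisome"),
]

DEFAULT_SITE = "cytoplasm"  # assumed fallback when no rule fires


def predict(seq: str, origin: str) -> str:
    """Return the site of the first rule whose condition is satisfied."""
    for rule in RULES:
        if rule.condition(seq, origin):
            return rule.site
    return DEFAULT_SITE


def accuracy(examples: List[Tuple[str, str, str]]) -> float:
    """Fraction of (sequence, origin, known_site) examples predicted correctly."""
    correct = sum(predict(seq, origin) == site for seq, origin, site in examples)
    return correct / len(examples)


# Toy, made-up examples standing in for a held-out test set.
toy_test_set = [
    ("MAAAKDEL", "animal", "endoplasmic reticulum"),
    ("MAAASKL", "plant", "peroxisome"),
    ("MAAAHDEL", "yeast", "endoplasmic reticulum"),
    ("MAAAGGG", "animal", "nucleus"),  # no rule fires, so this one is missed
]
print(f"accuracy on toy test set: {accuracy(toy_test_set):.2f}")
```

Evaluating the same `accuracy` function separately on training and testing examples is how figures such as the 66% and 59% reported above would be obtained; the toy data here says nothing about the real system's performance.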
