protein2vec: Predicting Protein-Protein Interactions Based on LSTM.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2022 May-Jun;19(3):1257-1266. doi: 10.1109/TCBB.2020.3003941. Epub 2022 Jun 3.

DOI:10.1109/TCBB.2020.3003941
PMID:32750870
Abstract

The semantic similarity of gene ontology (GO) terms is widely used to predict protein-protein interactions (PPIs). Traditional semantic similarity measures rely mainly on manually crafted features, which may ignore important hidden information in the gene ontology. Moreover, those methods usually derive the similarity between proteins from the similarity between GO terms using simple statistical rules, such as MAX and BMA (best-match average), which oversimplifies the potentially complex relationship between proteins and the GO terms annotated to them. To overcome these two deficiencies, we propose a new method named protein2vec, which characterizes a protein with a vector based on the GO terms annotated to it and combines information from both the GO and known PPIs. We first apply a network embedding algorithm to the GO network to generate a feature vector for each GO term. Then, a Long Short-Term Memory (LSTM) network encodes the feature vectors of the GO terms annotated to a protein into another vector (called the protein vector). Finally, two protein vectors are fed into a feedforward neural network to predict the interaction between the two corresponding proteins. The experimental results show that protein2vec outperforms almost all commonly used traditional semantic similarity methods.
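
The three steps described in the abstract (GO-term vectors, LSTM encoding into a protein vector, feedforward scoring of a protein pair) can be sketched in plain Python. This is a minimal, untrained illustration and not the authors' implementation. Everything concrete in it is an assumption made for the example: the vector sizes `EMB` and `HID`, the random weights, the random vectors standing in for the output of a network embedding algorithm, the arbitrary GO identifiers, and the helper names `encode_protein` and `predict_interaction`.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def rand_matrix(rows, cols):
    """Small random weights; a real model would learn these from known PPIs."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

EMB, HID = 8, 6  # hypothetical embedding and hidden sizes

# Step 1 (placeholder): random vectors stand in for GO-term embeddings that
# a network embedding algorithm would produce from the GO graph.
GO_TERMS = ["GO:0005515", "GO:0005634", "GO:0006355", "GO:0016020"]
go_vec = {term: [rng.uniform(-1, 1) for _ in range(EMB)] for term in GO_TERMS}

# Step 2: a standard LSTM cell. Each gate has weights over the concatenated
# input and previous hidden state, plus a bias.
GATES = ("input", "forget", "output", "candidate")
W = {g: rand_matrix(HID, EMB + HID) for g in GATES}
B = {g: [0.0] * HID for g in GATES}

def encode_protein(terms):
    """Run the LSTM over a protein's GO-term vectors; return the final hidden state."""
    h = [0.0] * HID
    c = [0.0] * HID
    for term in terms:
        z = go_vec[term] + h  # list concatenation: [x_t; h_{t-1}]
        pre = {g: [a + b for a, b in zip(matvec(W[g], z), B[g])] for g in GATES}
        i_gate = [sigmoid(v) for v in pre["input"]]
        f_gate = [sigmoid(v) for v in pre["forget"]]
        o_gate = [sigmoid(v) for v in pre["output"]]
        cand = [math.tanh(v) for v in pre["candidate"]]
        c = [f * ck + i * g for f, ck, i, g in zip(f_gate, c, i_gate, cand)]
        h = [o * math.tanh(ck) for o, ck in zip(o_gate, c)]
    return h  # the "protein vector"

# Step 3: a tiny feedforward network scores a pair of protein vectors.
W1 = rand_matrix(4, 2 * HID)
W2 = rand_matrix(1, 4)

def predict_interaction(terms_a, terms_b):
    """Return a score in (0, 1) for the pair; meaningless until trained."""
    pair = encode_protein(terms_a) + encode_protein(terms_b)
    hidden = [max(0.0, v) for v in matvec(W1, pair)]  # ReLU
    return sigmoid(matvec(W2, hidden)[0])

score = predict_interaction(
    ["GO:0005515", "GO:0005634"],
    ["GO:0006355", "GO:0016020", "GO:0005515"],
)
print(f"untrained interaction score: {score:.3f}")
```

Because the LSTM consumes a sequence, the order in which a protein's GO terms are presented affects the resulting protein vector; how the terms should be ordered is a design decision this sketch leaves open.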

Similar articles

1. protein2vec: Predicting Protein-Protein Interactions Based on LSTM.
   IEEE/ACM Trans Comput Biol Bioinform. 2022 May-Jun;19(3):1257-1266. doi: 10.1109/TCBB.2020.3003941. Epub 2022 Jun 3.
2. Learning representations for gene ontology terms by jointly encoding graph structure and textual node descriptors.
   Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac318.
3. Anc2vec: embedding gene ontology terms by preserving ancestors relationships.
   Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac003.
4. Correlating information contents of gene ontology terms to infer semantic similarity of gene products.
   Comput Math Methods Med. 2014;2014:891842. doi: 10.1155/2014/891842. Epub 2014 May 22.
5. Measure the Semantic Similarity of GO Terms Using Aggregate Information Content.
   IEEE/ACM Trans Comput Biol Bioinform. 2014 May-Jun;11(3):468-76. doi: 10.1109/TCBB.2013.176.
6. Word and Sentence Embedding Tools to Measure Semantic Similarity of Gene Ontology Terms by Their Definitions.
   J Comput Biol. 2019 Jan;26(1):38-52. doi: 10.1089/cmb.2018.0093. Epub 2018 Oct 31.
7. Exploring the relationship between hub proteins and drug targets based on GO and intrinsic disorder.
   Comput Biol Chem. 2015 Jun;56:41-8. doi: 10.1016/j.compbiolchem.2015.03.003. Epub 2015 Mar 23.
8. Assessment of Semantic Similarity between Proteins Using Information Content and Topological Properties of the Gene Ontology Graph.
   IEEE/ACM Trans Comput Biol Bioinform. 2018 May-Jun;15(3):839-849. doi: 10.1109/TCBB.2017.2689762. Epub 2017 Mar 31.
9. TANGO: A GO-Term Embedding Based Method for Protein Semantic Similarity Prediction.
   IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):694-706. doi: 10.1109/TCBB.2022.3143480. Epub 2023 Feb 3.
10. A relation based measure of semantic similarity for Gene Ontology annotations.
   BMC Bioinformatics. 2008 Nov 4;9:468. doi: 10.1186/1471-2105-9-468.

Cited by

1. A TRIM Family-Based Strategy for TRIMCIV Target Prediction in a Pan-Cancer Context with Multi-Omics Data and Protein Docking Integration.
   Biology (Basel). 2025 Jun 22;14(7):742. doi: 10.3390/biology14070742.
2. Protein Sequence Analysis landscape: A Systematic Review of Task Types, Databases, Datasets, Word Embeddings Methods, and Language Models.
   Database (Oxford). 2025 May 30;2025. doi: 10.1093/database/baaf027.
3. Genome language modeling (GLM): a beginner's cheat sheet.
   Biol Methods Protoc. 2025 Mar 25;10(1):bpaf022. doi: 10.1093/biomethods/bpaf022. eCollection 2025.
4. RecGOBD: accurate recognition of gene ontology related brain development protein functions through multi-feature fusion and attention mechanisms.
   Bioinform Adv. 2024 Nov 4;4(1):vbae163. doi: 10.1093/bioadv/vbae163. eCollection 2024.
5. A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond.
   Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae162.
6. Fusing Sequence and Structural Knowledge by Heterogeneous Models to Accurately and Interpretively Predict Drug-Target Affinity.
   Molecules. 2023 Dec 8;28(24):8005. doi: 10.3390/molecules28248005.
7. Recent Advances in Deep Learning for Protein-Protein Interaction Analysis: A Comprehensive Review.
   Molecules. 2023 Jul 2;28(13):5169. doi: 10.3390/molecules28135169.