A Sub-Sequence Based Approach to Protein Function Prediction via Multi-Attention Based Multi-Aspect Network.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):94-105. doi: 10.1109/TCBB.2021.3130923. Epub 2023 Feb 3.

DOI:10.1109/TCBB.2021.3130923
PMID:34826296
Abstract

Inferring protein function(s) via protein sub-sequence classification is often obstructed by a lack of knowledge about the function(s) of sub-sequences in the protein sequence. In this regard, we develop a novel "multi-aspect" paradigm to perform sub-sequence classification efficiently by utilizing the information of the parent sequence. The aspects are: (i) Multi-label: independent labelling of sub-sequences with more than one function of the parent sequence, and (ii) Label-relevance: scoring the parent functions to highlight how relevant a given function is to the sub-sequence. The multi-aspect paradigm is used to propose the "Multi-Attention Based Multi-Aspect Network" for classifying protein sub-sequences, where multi-attention is a novel approach to processing sub-sequences at the word level. Next, the proposed Global-ProtEnc method is a sub-sequence based approach to encoding protein sequences for the protein function prediction task, which is finally used to develop an ensemble method, Global-ProtEnc-Plus. Evaluations of both the Global-ProtEnc and the Global-ProtEnc-Plus methods on the benchmark CAFA3 dataset delivered outstanding performance. Compared to the state-of-the-art DeepGOPlus, the improvements in F with Global-ProtEnc-Plus are +6.50 percent for the biological process and +1.90 percent for the cellular component.
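
The multi-label aspect described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the fixed-size sliding window, the `window` and `stride` values, the function names, and the placeholder labels `function_A` and `function_B` are all assumptions made for illustration. The label-relevance aspect is left out, since the abstract does not say how the relevance scores are computed.

```python
from typing import Iterable, List, Tuple


def split_into_subsequences(sequence: str, window: int, stride: int) -> List[str]:
    """Slide a fixed-size window over the sequence.

    The windowing scheme is an assumption; the abstract does not
    state how sub-sequences are extracted.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(sequence) <= window:
        # A short sequence is kept as a single sub-sequence.
        return [sequence]
    last_start = len(sequence) - window
    return [sequence[i:i + window] for i in range(0, last_start + 1, stride)]


def label_subsequences(
    sequence: str,
    parent_functions: Iterable[str],
    window: int = 4,
    stride: int = 2,
) -> List[Tuple[str, List[str]]]:
    """Pair every sub-sequence with the full label set of its parent sequence."""
    labels = sorted(set(parent_functions))
    subsequences = split_into_subsequences(sequence, window, stride)
    # Each sub-sequence gets its own copy of the parent's labels.
    return [(sub, list(labels)) for sub in subsequences]


if __name__ == "__main__":
    # Toy sequence and placeholder labels, not real annotations.
    for sub, labels in label_subsequences("MKTAYIAK", ["function_B", "function_A"]):
        print(sub, labels)
```

Each sub-sequence inherits every function of its parent here, so a downstream classifier would see one multi-label training example per window rather than one per protein.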

Similar articles

1
A Sub-Sequence Based Approach to Protein Function Prediction via Multi-Attention Based Multi-Aspect Network.
IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):94-105. doi: 10.1109/TCBB.2021.3130923. Epub 2023 Feb 3.
2
Lite-SeqCNN: A Light-Weight Deep CNN Architecture for Protein Function Prediction.
IEEE/ACM Trans Comput Biol Bioinform. 2023 May-Jun;20(3):2242-2253. doi: 10.1109/TCBB.2023.3240169. Epub 2023 Jun 5.
3
DeepGOPlus: improved protein function prediction from sequence.
Bioinformatics. 2020 Jan 15;36(2):422-429. doi: 10.1093/bioinformatics/btz595.
4
MCWS-Transformers: Towards an Efficient Modeling of Protein Sequences via Multi Context-Window Based Scaled Self-Attention.
IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1188-1199. doi: 10.1109/TCBB.2022.3173789. Epub 2023 Apr 3.
5
An Ensemble Tf-Idf Based Approach to Protein Function Prediction via Sequence Segmentation.
IEEE/ACM Trans Comput Biol Bioinform. 2022 Sep-Oct;19(5):2685-2696. doi: 10.1109/TCBB.2021.3093060. Epub 2022 Oct 10.
6
Reduction strategies for hierarchical multi-label classification in protein function prediction.
BMC Bioinformatics. 2016 Sep 15;17(1):373. doi: 10.1186/s12859-016-1232-1.
7
Deep Robust Framework for Protein Function Prediction Using Variable-Length Protein Sequences.
IEEE/ACM Trans Comput Biol Bioinform. 2020 Sep-Oct;17(5):1648-1659. doi: 10.1109/TCBB.2019.2911609. Epub 2019 Apr 16.
8
MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping.
J Mol Biol. 2018 Jul 20;430(15):2256-2265. doi: 10.1016/j.jmb.2018.03.004. Epub 2018 Mar 10.
9
SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition.
BMC Bioinformatics. 2007 May 22;8 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-8-S4-S2.
10
Prediction of protein subcellular localization.
Proteins. 2006 Aug 15;64(3):643-51. doi: 10.1002/prot.21018.

Cited by

1
GOBoost: leveraging long-tail gene ontology terms for accurate protein function prediction.
Bioinformatics. 2025 Jun 2;41(6). doi: 10.1093/bioinformatics/btaf267.
2
A multimodal model for protein function prediction.
Sci Rep. 2025 Mar 26;15(1):10465. doi: 10.1038/s41598-025-94612-y.