A Sub-Sequence Based Approach to Protein Function Prediction via Multi-Attention Based Multi-Aspect Network.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):94-105. doi: 10.1109/TCBB.2021.3130923. Epub 2023 Feb 3.

Abstract

Inferring protein function via sub-sequence classification is often obstructed by a lack of knowledge about the functions of the sub-sequences within a protein sequence. To address this, we develop a novel "multi-aspect" paradigm that performs sub-sequence classification efficiently by utilizing the information of the parent sequence. The aspects are: (i) Multi-label: independent labelling of sub-sequences with more than one function of the parent sequence, and (ii) Label-relevance: scoring the parent functions to highlight how relevant each function is to the sub-sequence. The multi-aspect paradigm is used to propose the "Multi-Attention Based Multi-Aspect Network" for classifying protein sub-sequences, where multi-attention is a novel approach to processing sub-sequences at the word level. Next, the proposed Global-ProtEnc method is a sub-sequence based approach to encoding protein sequences for the protein function prediction task, which is finally used to develop an ensemble method, Global-ProtEnc-Plus. Evaluations of both Global-ProtEnc and Global-ProtEnc-Plus on the benchmark CAFA3 dataset delivered outstanding performance. Compared to the state-of-the-art DeepGOPlus, Global-ProtEnc-Plus improves the F-measure by +6.50 percent for biological process and +1.90 percent for cellular component.
