PFP-GO: Integrating protein sequence, domain and protein-protein interaction information for protein function prediction using ranked GO terms.

Authors

Sengupta Kaustav, Saha Sovan, Halder Anup Kumar, Chatterjee Piyali, Nasipuri Mita, Basu Subhadip, Plewczynski Dariusz

Affiliations

Laboratory of Functional and Structural Genomics, Center of New Technologies, University of Warsaw, Warsaw, Poland.

Department of Computer Science and Engineering, Jadavpur University, Kolkata, India.

Publication information

Front Genet. 2022 Sep 29;13:969915. doi: 10.3389/fgene.2022.969915. eCollection 2022.

Abstract

Protein function prediction is emerging as an essential field in biological and computational studies. Although computational approaches are now well established, it has been observed that combining information gathered from multiple sources has a greater impact than relying on information derived from a single source. We therefore propose PFP-GO, a method in which three heterogeneous sources (protein sequence, protein domain, and the protein-protein interaction network) are processed separately to rank each individual functional Gene Ontology (GO) term. Based on this ranking, GO terms are propagated to the target proteins. During GO ranking, protein sequence contributes sequence-based information, while protein domains and protein-protein interaction networks contribute structural/functional and topological information, respectively. PFP-GO is evaluated in terms of Precision, Recall, and F-Score, achieving overall values of 0.67, 0.58, and 0.62, respectively, and it was found to perform reasonably better than existing state-of-the-art methods. Furthermore, we examine, through multilayer network propagation, some of the top-ranked GO terms predicted by PFP-GO that affect the 3D structure of the genome. The complete source code of PFP-GO is freely available at https://sites.google.com/view/pfp-go/.
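
For readers who want to see how the three reported metrics relate, the following is a minimal sketch, not the authors' evaluation code. It assumes that the F-Score is the usual harmonic mean of Precision and Recall (F1) and that predictions are compared with known annotations as sets of GO terms per protein; the abstract does not say how per-protein scores are aggregated into the overall values. The function names and the example GO identifiers are illustrative and are not taken from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); assumed definition."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(predicted: set, annotated: set) -> tuple:
    """Precision, recall, and F-score for one protein's GO term sets."""
    true_positives = len(predicted & annotated)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    return precision, recall, f_score(precision, recall)


# Illustrative GO term sets for a single hypothetical protein.
predicted_terms = {"GO:0005515", "GO:0003677", "GO:0006355"}
annotated_terms = {"GO:0005515", "GO:0003677"}
p, r, f = evaluate(predicted_terms, annotated_terms)
print(f"precision={p:.2f} recall={r:.2f} f_score={f:.2f}")

# Consistency check on the overall values reported in the abstract.
print(round(f_score(0.67, 0.58), 2))
```

Under the F1 assumption, a Precision of 0.67 and a Recall of 0.58 give 2 × 0.67 × 0.58 / 1.25 ≈ 0.62, which matches the reported F-Score.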

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00d7/9556876/be23e6b7688a/fgene-13-969915-g001.jpg
