Zhang Shugang, Li Yuntong, Ma Wenjian, Cai Qing, Qin Jing, Bi Xiangpeng, Jiang Huasen, Huang Xiaoyu, Wei Zhiqiang
College of Computer Science and Technology, Ocean University of China, Qingdao, China.
College of Education, Qingdao Hengxing University of Science and Technology, Qingdao, China.
PLoS Comput Biol. 2025 Aug 1;21(8):e1013343. doi: 10.1371/journal.pcbi.1013343. eCollection 2025 Aug.
Understanding protein function is of great importance for deciphering the mechanisms of life activities. More than 200 million proteins are known to date, but only 0.2% of them have well-annotated functional terms. By measuring the contacts among residues, proteins can be described as graphs, so that graph learning approaches can be applied to learn protein representations. However, existing graph-based methods focus on enriching the residue node information and do not fully exploit the edge information. Given the strong association of residue contacts with protein structure and function, this leads to suboptimal representations. In this article, we propose SuperEdgeGO, which introduces supervision of the edges in protein graphs to learn a better graph representation for protein function prediction. Unlike common graph convolution methods that use edge information in a plain or unsupervised way, we introduce a supervised attention that encodes the residue contacts explicitly into the protein representation. Comprehensive experiments demonstrate that SuperEdgeGO achieves state-of-the-art performance on all three categories of protein functions, and additional ablation analysis further confirms the effectiveness of the devised edge supervision strategy. This strategy has broad application prospects in the study of protein function and related fields.
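As an illustration of the two ideas in the abstract (describing a protein as a residue contact graph, and supervising edge-level scores with those contacts), the following is a minimal sketch and not the SuperEdgeGO implementation. It assumes one 3D coordinate per residue (for example the C-alpha atom) and a 10 Å distance cutoff; both choices, the function names `contact_edges` and `edge_supervision_loss`, and the use of a plain binary cross-entropy as the supervision signal are assumptions made for this example and are not taken from the paper.

```python
import math


def contact_edges(coords, cutoff=10.0):
    """Return residue pairs (i, j), i < j, whose distance is below `cutoff`.

    `coords` is a list of (x, y, z) tuples, one per residue.
    The 10.0 default cutoff (in angstroms) is an assumed value.
    """
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                edges.append((i, j))
    return edges


def edge_supervision_loss(scores, edges, n):
    """Mean binary cross-entropy between edge scores and contact labels.

    `scores[i][j]` is an unnormalised score (logit) for the pair (i, j),
    standing in for an attention weight produced by a model.
    The label is 1 when (i, j) is a contact and 0 otherwise.
    This generic loss is an assumption, not the paper's formulation.
    """
    edge_set = set(edges)
    total, count = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            p = 1.0 / (1.0 + math.exp(-scores[i][j]))  # sigmoid
            p = min(max(p, 1e-12), 1.0 - 1e-12)        # avoid log(0)
            y = 1.0 if (i, j) in edge_set else 0.0
            total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
            count += 1
    return total / count if count else 0.0


# Toy chain: three residues spaced 3.8 angstroms apart, one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
edges = contact_edges(coords)

# Hypothetical scores that agree with the contact map (high on contacts).
n = len(coords)
good = [[5.0 if (i, j) in set(edges) else -5.0 for j in range(n)] for i in range(n)]
loss = edge_supervision_loss(good, edges, n)
```

In this toy example the three nearby residues form the contacts and the distant residue has none, so scores that agree with the contact map yield a small loss while scores that contradict it yield a large one. Adding a term of this kind to a training objective is one generic way to make a model's edge weights reflect residue contacts.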