MEGA-GO: functions prediction of diverse protein sequence length using Multi-scalE Graph Adaptive neural network.

Authors

Lee Yujian, Gao Peng, Xu Yongqi, Wang Ziyang, Li Shuaicheng, Chen Jiaxing

Affiliations

Guangdong Provincial Key Laboratory IRADS, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai 519087, China.

Department of Computer Science, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai 519087, China.

Publication information

Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf032.

Abstract

MOTIVATION

The increasing accessibility of large-scale protein sequences through advanced sequencing technologies has necessitated the development of efficient and accurate methods for predicting protein function. Computational prediction models have emerged as a promising solution to expedite the annotation process. However, despite making significant progress in protein research, graph neural networks face challenges in capturing long-range structural correlations and identifying critical residues in protein graphs. Furthermore, existing models have limitations in effectively predicting the function of newly sequenced proteins that are not included in protein interaction networks. This highlights the need for novel approaches integrating protein structure and sequence data.

RESULTS

We introduce the Multi-scalE Graph Adaptive neural network (MEGA-GO), which captures features of proteins with diverse sequence lengths at multiple scales. Its graph adaptive neural network architecture enables a more nuanced extraction of graph structure features, capturing intricate relationships within biological data. Experimental results demonstrate that MEGA-GO outperforms mainstream protein function prediction models in Gene Ontology term classification, achieving an area under the precision-recall curve of 33.4%, 68.9%, and 44.6% on the biological process, molecular function, and cellular component domains, respectively. The remaining experimental results show that our model consistently surpasses state-of-the-art methods.
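
The area under the precision-recall curve reported above can be estimated in several ways. The following is a minimal sketch, not the authors' evaluation code: it assumes the step-wise "average precision" estimator and a micro-averaged setting in which all (protein, GO term) pairs are pooled before ranking. The function names `average_precision` and `micro_aupr` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    y_true holds binary labels (1 = term annotated, 0 = not annotated);
    y_score holds the predicted scores. Tied scores are ranked in input
    order, which is a simplification made for this sketch.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank predictions from the highest score to the lowest.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)

    true_positives = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            true_positives += 1
            # Precision at this rank; each positive adds a recall step of 1/n_pos.
            total += true_positives / rank
    return total / n_pos


def micro_aupr(
    label_matrix: Sequence[Sequence[int]],
    score_matrix: Sequence[Sequence[float]],
) -> float:
    """Pool every (protein, GO term) pair, then compute average precision."""
    flat_labels = [v for row in label_matrix for v in row]
    flat_scores = [v for row in score_matrix for v in row]
    return average_precision(flat_labels, flat_scores)
```

Here each row of `label_matrix` would correspond to one protein and each column to one GO term. Whether the paper pools pairs this way or averages per protein or per term is not stated in the abstract, so the pooling choice is an assumption.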

AVAILABILITY AND IMPLEMENTATION

The source code and data of MEGA-GO are available at https://github.com/Cheliosoops/MEGA-GO.
