MEGA-GO: functions prediction of diverse protein sequence length using Multi-scalE Graph Adaptive neural network.

Authors

Lee Yujian, Gao Peng, Xu Yongqi, Wang Ziyang, Li Shuaicheng, Chen Jiaxing

Affiliations

Guangdong Provincial Key Laboratory IRADS, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai 519087, China.

Department of Computer Science, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai 519087, China.

Publication

Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf032.

DOI: 10.1093/bioinformatics/btaf032
PMID: 39847542
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11810639/
Abstract

MOTIVATION

The increasing accessibility of large-scale protein sequences through advanced sequencing technologies has necessitated the development of efficient and accurate methods for predicting protein function. Computational prediction models have emerged as a promising solution to expedite the annotation process. However, despite making significant progress in protein research, graph neural networks face challenges in capturing long-range structural correlations and identifying critical residues in protein graphs. Furthermore, existing models have limitations in effectively predicting the function of newly sequenced proteins that are not included in protein interaction networks. This highlights the need for novel approaches integrating protein structure and sequence data.

RESULTS

We introduce the Multi-scalE Graph Adaptive neural network (MEGA-GO), which captures features at multiple scales from proteins of diverse sequence lengths. Its graph adaptive neural network architecture enables a more nuanced extraction of graph structure features, capturing intricate relationships within biological data. Experimental results demonstrate that MEGA-GO outperforms mainstream protein function prediction models in Gene Ontology term classification, achieving an area under the precision-recall curve of 33.4%, 68.9%, and 44.6% on the biological process, molecular function, and cellular component domains, respectively. The remaining experiments show that the model consistently surpasses state-of-the-art methods.
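For readers unfamiliar with the metric reported above, the following is a minimal sketch of one common way to estimate the area under the precision-recall curve for multi-label Gene Ontology term prediction. It is not taken from the MEGA-GO repository: the abstract does not state the exact evaluation protocol, so the micro-averaging over all protein/term pairs, the step-wise (average precision) estimator, and the names `average_precision`, `micro_aupr`, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank predictions from most to least confident (ties keep input order).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at the rank of each true positive
    return total / n_pos


def micro_aupr(label_matrix: Sequence[Sequence[int]],
               score_matrix: Sequence[Sequence[float]]) -> float:
    """Assumed micro-averaging: pool every (protein, GO term) pair, then score."""
    flat_labels = [label for row in label_matrix for label in row]
    flat_scores = [score for row in score_matrix for score in row]
    return average_precision(flat_labels, flat_scores)


# Hypothetical toy data: 2 proteins x 3 GO terms (1 = term annotated).
toy_labels = [[1, 0, 0], [0, 1, 1]]
toy_scores = [[0.9, 0.2, 0.1], [0.8, 0.7, 0.3]]
toy_aupr = micro_aupr(toy_labels, toy_scores)
print(f"micro-averaged AUPR on toy data: {toy_aupr:.3f}")
```

A value such as the 68.9% reported for molecular function corresponds to 0.689 on this 0-to-1 scale. Other estimators (for example trapezoidal interpolation, or averaging per term rather than pooling) can give somewhat different numbers on the same predictions.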

AVAILABILITY AND IMPLEMENTATION

The source code and data of MEGA-GO are available at https://github.com/Cheliosoops/MEGA-GO.

