Integration of protein sequence and protein-protein interaction data by hypergraph learning to identify novel protein complexes.

Affiliations

School of Basic Medical Sciences, Anhui Medical University, 81 Meishan Road, Shushan District, Hefei 230032, China.

State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, 38 Life Science Park, Changping District, Beijing 102206, China.

Publication information

Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae274.

DOI:10.1093/bib/bbae274

PMID:38851299

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11162299/

Abstract

Protein-protein interactions (PPIs) underlie many important biological processes, and protein complexes are the key form in which these interactions are carried out. Understanding protein complexes and their functions is critical for elucidating the mechanisms of life processes, for disease diagnosis and treatment, and for drug development. However, experimental methods for identifying protein complexes have many limitations, so computational methods are needed to predict them. Protein sequences indicate the structure and biological functions of proteins and also determine their ability to bind other proteins, which influences the formation of protein complexes. Integrating these characteristics to predict protein complexes is very promising, but there is currently no effective framework that uses both protein sequence and PPI network topology for complex prediction. To address this challenge, we developed HyperGraphComplex, a method based on a hypergraph variational autoencoder that captures expressive features from protein sequences without feature engineering while also considering the topological properties of PPI networks. Experimental results demonstrate that HyperGraphComplex achieves satisfactory predictive performance compared with state-of-the-art methods. Further bioinformatics analysis shows that the predicted protein complexes have attributes similar to those of known complexes. Moreover, case studies corroborated the model's predictive capability in identifying protein complexes, including three that were not only experimentally validated by recent studies but also received high-confidence structural predictions from AlphaFold-Multimer.
We believe that the HyperGraphComplex algorithm and our proteome-wide, high-confidence protein complex prediction dataset will help elucidate how proteins regulate cellular processes in the form of complexes and will facilitate disease diagnosis and treatment and drug development. Source code is available at https://github.com/LiDlab/HyperGraphComplex.
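
To make the idea of learning on a hypergraph more concrete, the following is a minimal sketch of one feature-propagation step over a hypergraph, using the widely used normalized form D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} X with unit hyperedge weights. It is not taken from the HyperGraphComplex source code: the function name `hypergraph_propagate`, the toy incidence matrix `H`, and the way proteins are grouped into hyperedges are all assumptions made for illustration, and the abstract does not describe how the authors build their hypergraph or their encoder.

```python
import math

def hypergraph_propagate(H, X):
    """Smooth node features X over a hypergraph with incidence matrix H.

    H[i][e] is 1 if node (protein) i belongs to hyperedge e, else 0.
    Returns A @ X, where A = Dv^-1/2 H De^-1 H^T Dv^-1/2 (unit edge weights).
    Hypothetical helper for illustration only.
    """
    n, m = len(H), len(H[0])
    d_v = [sum(row) for row in H]                              # node degrees
    d_e = [sum(H[i][e] for i in range(n)) for e in range(m)]   # hyperedge sizes

    # Build the normalized node-to-node propagation matrix A.
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if d_v[i] == 0 or d_v[j] == 0:
                continue  # isolated nodes receive no propagated signal
            shared = sum(H[i][e] * H[j][e] / d_e[e]
                         for e in range(m) if d_e[e] > 0)
            A[i][j] = shared / math.sqrt(d_v[i] * d_v[j])

    # Multiply A by the feature matrix X.
    k = len(X[0])
    return [[sum(A[i][j] * X[j][c] for j in range(n)) for c in range(k)]
            for i in range(n)]

# Toy example (assumed data): 4 proteins, 2 hyperedges.
# Hyperedge 0 groups proteins 0, 1, 2; hyperedge 1 groups proteins 1, 2, 3.
H = [[1, 0],
     [1, 1],
     [1, 1],
     [0, 1]]
# Identity features, so the result Z equals the propagation matrix itself.
X = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
Z = hypergraph_propagate(H, X)
```

In this toy case, proteins 0 and 3 share no hyperedge, so no signal passes directly between them, while proteins 1 and 2 share both hyperedges and exchange the most. In a real model the identity features would be replaced by sequence-derived embeddings and the propagation would be followed by learned weights.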

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c1b/11162299/999b752d7e49/bbae274f1.jpg
