Predicting protein complexes in protein interaction networks using Mapper and graph convolution networks.

Author information

Daou Leonardo, Hanna Eileen Marie

Affiliation

Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon.

Publication information

Comput Struct Biotechnol J. 2024 Oct 10;23:3595-3609. doi: 10.1016/j.csbj.2024.10.009. eCollection 2024 Dec.

Abstract

Protein complexes are groups of interacting proteins that are central to many biological processes. Studying them can improve our understanding of cellular functions and malfunctions and thus support the development of effective disease treatments. High-throughput experimental techniques make it possible to generate large-scale protein-protein interaction datasets, and various computational approaches for predicting protein complexes from such interactions have accordingly been proposed in the literature. These approaches are typically based on networks in which nodes represent proteins and edges represent their interactions. State-of-the-art approaches mainly rely on clustering static networks to identify complexes. However, because protein interactions are highly dynamic, recent approaches seek to model these dynamics, typically by integrating gene expression data, and identify protein complexes accordingly. We propose MComplex, a method that combines time-series gene expression data with interaction data to generate a temporal network, which is passed to a generative adversarial network whose generator is a graph convolutional network. This produces embeddings that are then analyzed with a modified, graph-based version of the Mapper algorithm to predict the corresponding protein complexes. We test our approach on multiple benchmark datasets and compare the identified complexes against gold-standard protein complex datasets. Our results show that MComplex outperforms existing methods in several evaluation aspects, namely recall and maximum matching ratio, as well as in a composite score that aggregates the evaluation measures. The code and data are freely available for download from https://github.com/LeonardoDaou/MComplex.
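
As a rough illustration of the first step described in the abstract (combining time-series gene expression with interaction data to obtain a temporal network), the following minimal sketch keeps an interaction at a given time point only when both proteins are considered active at that time point. The function name `temporal_network`, the toy data, and the per-protein mean-expression activity cutoff are all assumptions made for this example; the abstract does not state which activity rule MComplex uses, and the sketch is not taken from the MComplex repository.

```python
from statistics import mean


def temporal_network(edges, expression, thresholds=None):
    """Return one edge list per time point (hypothetical helper).

    edges: iterable of (protein_a, protein_b) pairs from a static PPI network.
    expression: dict mapping protein -> list of expression values,
        one value per time point (all lists have the same length).
    thresholds: optional dict mapping protein -> activity cutoff.
        Defaults to each protein's mean expression, which is an
        illustrative assumption, not the rule used in the paper.
    """
    n_steps = len(next(iter(expression.values())))
    if thresholds is None:
        thresholds = {p: mean(values) for p, values in expression.items()}

    snapshots = []
    for t in range(n_steps):
        # A protein is "active" at time t if its expression reaches its cutoff.
        active = {p for p, values in expression.items() if values[t] >= thresholds[p]}
        # Keep only interactions whose two endpoints are both active at time t.
        snapshots.append([(a, b) for a, b in edges if a in active and b in active])
    return snapshots


# Toy data: three proteins, three time points (values are made up).
edges = [("A", "B"), ("B", "C"), ("A", "C")]
expression = {
    "A": [5.0, 1.0, 5.0],
    "B": [4.0, 4.0, 0.5],
    "C": [0.2, 3.0, 3.0],
}

snapshots = temporal_network(edges, expression)
for t, snapshot in enumerate(snapshots):
    print(t, snapshot)
```

Each snapshot is a time-specific subnetwork of the static interaction network; in the pipeline the abstract describes, a temporal network of this kind is what would be passed on to the embedding stage.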

