Identification of protein complexes by integrating multiple alignment of protein interaction networks.

Author information

Ma Cheng-Yu, Chen Yi-Ping Phoebe, Berger Bonnie, Liao Chung-Shou

Affiliations

Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan.

Department of Computer Science and Computer Engineering, La Trobe University, Melbourne, Vic, Australia.

Publication information

Bioinformatics. 2017 Jun 1;33(11):1681-1688. doi: 10.1093/bioinformatics/btx043.

DOI:10.1093/bioinformatics/btx043

PMID:28130237

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC5860626/

Abstract

MOTIVATION

Protein complexes are one of the keys to studying the behavior of a cell system. Many biological functions are carried out by protein complexes. During the past decade, the main strategy used to identify protein complexes from high-throughput network data has been to extract near-cliques or highly dense subgraphs from a single protein-protein interaction (PPI) network. Although experimental PPI data have increased significantly over recent years, most PPI networks still have many false positive interactions and false negative edge loss due to the limitations of high-throughput experiments. In particular, the false negative errors restrict the search space of such conventional protein complex identification approaches. Thus, it has become one of the most challenging tasks in systems biology to automatically identify protein complexes.

RESULTS

In this study, we propose a new algorithm, NEOComplex (NECC- and Ortholog-based Complex identification by multiple network alignment), which integrates functional orthology information that can be obtained from different types of multiple network alignment (MNA) approaches to expand the search space of protein complex detection. As part of our approach, we also define a new edge clustering coefficient (NECC) to assign weights to interaction edges in PPI networks so that protein complexes can be identified more accurately. The NECC is based on the intuition that there is functional information captured in the common neighbors of the common neighbors as well. Our results show that our algorithm outperforms well-known protein complex identification tools in a balance between precision and recall on three eukaryotic species: human, yeast, and fly. As a result of MNAs of the species, the proposed approach can tolerate edge loss in PPI networks and even discover sparse protein complexes which have traditionally been a challenge to predict.
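The abstract does not state the NECC formula, so the following is only a minimal sketch of the underlying intuition, not the authors' definition. It computes a common triangle-based edge clustering coefficient and then a hypothetical second-level score that also counts the common neighbors of an edge's common neighbors. The function names, the `alpha` parameter, and the unnormalized scoring are assumptions made for illustration.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Nodes adjacent to both u and v (excluding u and v themselves)."""
    return (adj[u] & adj[v]) - {u, v}


def ecc(adj, u, v):
    """A common triangle-based edge clustering coefficient:
    triangles through (u, v) divided by the maximum possible number.
    This is a textbook-style form, not the NECC from the paper."""
    possible = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if possible <= 0:
        return 0.0
    return len(common_neighbors(adj, u, v)) / possible


def second_level_support(adj, u, v):
    """Hypothetical: nodes shared by pairs of common neighbours of (u, v),
    i.e. 'common neighbours of the common neighbours'."""
    first = common_neighbors(adj, u, v)
    second = set()
    for a, b in combinations(sorted(first), 2):
        second |= common_neighbors(adj, a, b)
    return second - first - {u, v}


def illustrative_weight(adj, u, v, alpha=0.5):
    """Hypothetical unnormalized edge weight: direct common neighbours
    plus a discounted (alpha) count of second-level support."""
    first = common_neighbors(adj, u, v)
    second = second_level_support(adj, u, v)
    return len(first) + alpha * len(second)


# Toy network (made-up node labels, not real proteins).
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"),
             ("B", "D"), ("C", "E"), ("D", "E"), ("E", "F")]
toy_adj = build_graph(toy_edges)
print(ecc(toy_adj, "A", "B"), illustrative_weight(toy_adj, "A", "B"))
```

In this toy network, edge A-B has two common neighbors (C and D), and those two in turn share E, so the second-level term raises the weight of A-B beyond what triangle counting alone would give. An edge such as E-F, which has no common neighbors at either level, stays at zero.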

AVAILABILITY AND IMPLEMENTATION

http://acolab.ie.nthu.edu.tw/bionetwork/NEOComplex.

CONTACT

bab@csail.mit.edu or csliao@ie.nthu.edu.tw.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
