An efficient protein complex mining algorithm based on Multistage Kernel Extension.

Publication information

BMC Bioinformatics. 2014;15 Suppl 12(Suppl 12):S7. doi: 10.1186/1471-2105-15-S12-S7. Epub 2014 Nov 6.

Abstract

BACKGROUND

In recent years, many protein complex mining algorithms, such as the classical clique percolation method (CPM) and the Markov clustering (MCL) algorithm, have been developed for protein-protein interaction networks. However, most of the available algorithms concentrate on mining dense protein subgraphs as protein complexes and fail to take into account the inherent organizational structure within protein complexes. Thus, there is a critical need to study the possibility of mining protein complexes using the topological information hidden in edges. Moreover, recent large-scale experimental analyses reveal that protein complexes have their own intrinsic organization.

METHODS

Inspired by the formation of cliques in complex social networks and by the centrality-lethality rule, we propose a new protein complex mining algorithm, the Multistage Kernel Extension (MKE) algorithm, which integrates the idea of critical protein recognition in the protein-protein interaction (PPI) network. MKE first recognizes high-degree nodes as the first-level kernel of a protein complex, and then adds the weighted best neighbour node of the first-level kernel to the current kernel to form the second-level kernel of the protein complex. This process is repeated, extending the current kernel to form a protein complex. Finally, overlapping protein complexes are merged to form the final set of protein complexes.
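The abstract does not give the scoring function, the stopping rule, or the merge criterion, so the following is only a minimal Python sketch of the general seed, extend, and merge pattern described above, not the authors' implementation. Every name and parameter here is an assumption made for illustration: `mke_sketch`, `n_seeds`, `min_score`, and `min_overlap` are hypothetical, the edge weight is taken to be the number of common neighbours of the two endpoints, and two candidate complexes are merged when their overlap coefficient reaches `min_overlap`.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def neighbour_score(adj, candidate, kernel):
    """Assumed weighting: sum of common-neighbour counts over the edges
    linking the candidate to the kernel, averaged over the kernel size."""
    total = sum(len(adj[candidate] & adj[k]) for k in kernel if k in adj[candidate])
    return total / len(kernel)


def extend_kernel(adj, seed, min_score=1.0):
    """Grow a kernel from one seed by repeatedly adding the best-scoring
    neighbour until no neighbour reaches min_score (assumed stopping rule)."""
    kernel = {seed}
    while True:
        candidates = set().union(*(adj[n] for n in kernel)) - kernel
        if not candidates:
            break
        # Sorting first makes tie-breaking deterministic (labels must be sortable).
        best = max(sorted(candidates), key=lambda c: neighbour_score(adj, c, kernel))
        if neighbour_score(adj, best, kernel) < min_score:
            break
        kernel.add(best)
    return kernel


def merge_overlapping(complexes, min_overlap=0.8):
    """Merge any two sets whose overlap coefficient is at least min_overlap."""
    merged = [set(c) for c in complexes]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            a, b = merged[i], merged[j]
            if len(a & b) / min(len(a), len(b)) >= min_overlap:
                merged[i] = a | b
                del merged[j]
                changed = True
                break  # indices are stale after deletion, so rescan
    return merged


def mke_sketch(edges, n_seeds=2, min_score=1.0, min_overlap=0.8):
    """Seed with the highest-degree nodes, extend each kernel, then merge."""
    adj = build_graph(edges)
    seeds = sorted(adj, key=lambda n: (-len(adj[n]), n))[:n_seeds]
    complexes = [extend_kernel(adj, s, min_score) for s in seeds]
    return merge_overlapping(complexes, min_overlap)


if __name__ == "__main__":
    # Toy network: two 4-node cliques joined by a single bridge edge D-W.
    toy_edges = [
        ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
        ("W", "X"), ("W", "Y"), ("W", "Z"), ("X", "Y"), ("X", "Z"), ("Y", "Z"),
        ("D", "W"),
    ]
    for group in mke_sketch(toy_edges, n_seeds=3):
        print(sorted(group))
```

In this toy network the bridge edge shares no common neighbours, so under the assumed weighting it contributes nothing to a candidate's score and the two cliques are kept apart. A real PPI network would need a scoring function and thresholds taken from the full paper.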

RESULTS

MKE achieves better accuracy than the classical clique percolation method and the Markov clustering algorithm. MKE also outperforms the classical clique percolation method on both Gene Ontology semantic similarity and co-localization enrichment, and can effectively identify biologically significant protein complexes in the PPI network.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a45/4255745/de58ed00f0de/1471-2105-15-S12-S7-1.jpg
