Protein Complexes Prediction Method Based on Core-Attachment Structure and Functional Annotations.

Affiliation

College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China.

Publication information

Int J Mol Sci. 2017 Sep 6;18(9):1910. doi: 10.3390/ijms18091910.

Abstract

Recent advances in high-throughput laboratory techniques have produced large-scale protein-protein interaction (PPI) data, making it possible to build detailed maps of protein interaction networks and to detect protein complexes from them. However, most current state-of-the-art methods still have shortcomings: they cannot identify overlapping clusters, they do not consider the inherent organization within protein complexes, and they overlook the biological meaning of complexes. We therefore present a novel method for predicting overlapping protein complexes based on core-attachment structure and functional annotations (CFOCM). It works in two stages. First, it detects protein complex cores that maximize our defined cluster closeness function and whose proteins are closely related to at least one common function. Second, it adds attachment proteins to the detected cores to form the final complexes. For performance evaluation, CFOCM and six classical methods were used to identify protein complexes on three different yeast PPI networks, with three sets of real complexes selected as benchmarks: the Munich Information Center for Protein Sequences (MIPS), the Saccharomyces Genome Database (SGD), and the Catalogues of Yeast protein Complexes (CYC2008). The results show that CFOCM is effective and robust, achieving the highest F-measure values in all tests.
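The two-stage idea described in the abstract can be illustrated with a small sketch. This is not the CFOCM algorithm: the abstract does not define the cluster closeness function or the attachment rule, so the code below substitutes plain edge density for closeness and an assumed majority rule for attachment (a protein joins a core if it interacts with more than half of the core members). All function names, the `min_density` parameter, and the toy network are hypothetical.

```python
from itertools import combinations


def density(graph, nodes):
    """Edge density of the subgraph induced by `nodes`.

    ASSUMPTION: used here as a stand-in for the paper's closeness function,
    which the abstract does not define.
    """
    nodes = list(nodes)
    if len(nodes) < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in graph[u])
    return 2.0 * edges / (len(nodes) * (len(nodes) - 1))


def grow_core(graph, annotations, seed, min_density=0.7):
    """Stage 1 (sketch): greedily grow a dense core around `seed` while all
    members still share at least one functional annotation."""
    core = {seed}
    shared = set(annotations.get(seed, ()))
    while True:
        candidates = set().union(*(graph[u] for u in core)) - core
        best, best_score = None, 0.0
        for c in sorted(candidates):  # sorted for deterministic tie-breaking
            if not shared & set(annotations.get(c, ())):
                continue  # would leave the core without a common function
            score = density(graph, core | {c})
            if score >= min_density and score > best_score:
                best, best_score = c, score
        if best is None:
            return core
        core.add(best)
        shared &= set(annotations.get(best, ()))


def attach(graph, core):
    """Stage 2 (sketch): add proteins that interact with more than half of
    the core members. ASSUMPTION: this majority rule is illustrative only."""
    neighbours = set().union(*(graph[u] for u in core)) - core
    attached = {p for p in neighbours if 2 * len(graph[p] & core) > len(core)}
    return core | attached


def predict_complexes(graph, annotations, min_core_size=2):
    """Run both stages from every seed and drop duplicate cores."""
    seen, complexes = set(), []
    for seed in sorted(graph):
        core = frozenset(grow_core(graph, annotations, seed))
        if len(core) >= min_core_size and core not in seen:
            seen.add(core)
            complexes.append(attach(graph, set(core)))
    return complexes


# Hypothetical toy PPI network (adjacency sets) and functional annotations.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"A", "B", "E"},
    "E": {"D"},
}
annotations = {
    "A": {"f1"}, "B": {"f1"}, "C": {"f1"},
    "D": {"f2"}, "E": {"f2"},
}
complexes = predict_complexes(graph, annotations)
```

On this toy input, protein D ends up both as an attachment of the core {A, B, C} and as a member of the core {D, E}, which shows how a core-attachment scheme can yield overlapping complexes.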


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcc0/5618559/f857a9ba3f9f/ijms-18-01910-g001.jpg
