Predicting Protein Functions Based on Differential Co-expression and Neighborhood Analysis.

Author information

School of Computer Science and Technology, Dalian University of Technology, Dalian, China.

School of Computing and Information Technology, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya.

Publication information

J Comput Biol. 2021 Jan;28(1):1-18. doi: 10.1089/cmb.2019.0120. Epub 2020 Apr 17.

Abstract

Proteins are polypeptides essential to biological processes. Physical protein interactions are complemented by other types of functional relationship data, including genetic interactions, co-expression, and evolutionary pathways. Existing algorithms integrate protein interaction and gene expression data to retrieve context-specific subnetworks composed of genes/proteins with known and unknown functions. However, most protein function prediction algorithms fail to exploit the diverse intrinsic information in the feature and label spaces. We develop a novel integrative method, CNPFP, based on differential Co-expression analysis and a Neighbor-voting algorithm for Protein Function Prediction. The method integrates heterogeneous data and exploits intrinsic and latent linkages via a global iterative approach and genomic features. CNPFP performs three tasks: clustering, differential co-expression analysis, and protein function prediction. Our aim is to identify yeast cell cycle-specific proteins linked to differentially expressed proteins in the protein-protein interaction network. To capture intrinsic information, CNPFP selects the most relevant feature subset using a global iterative neighbor-voting algorithm. We identify eight condition-specific modules. The most relevant subnetwork has 87 genes and is highly enriched in cyclin-dependent kinases, protein kinases relevant to cell cycle regulation. We present comprehensive annotations for 3538 proteins. Our method achieves an AUROC of 0.9862, an accuracy of 0.9710, and an F-score of 0.9691. These results indicate that exploiting the intrinsic nature of protein relationships improves the quality of function prediction. Thus, the proposed method is useful in functional genomics studies.
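The neighbor-voting idea named in the abstract can be illustrated with a minimal sketch. This is not the CNPFP implementation, whose details the abstract does not give. It is a plain single-pass weighted vote, with no iteration or feature selection, and it assumes the network is given as weighted edges and that known functions are stored as sets. The function name `neighbor_vote`, the protein labels `P1` to `P4`, the function labels, and the edge weights are all hypothetical.

```python
from collections import defaultdict


def neighbor_vote(edges, annotations, protein):
    """Score candidate functions for `protein` from its annotated neighbors.

    edges: iterable of (u, v, weight) tuples for an undirected network.
    annotations: dict mapping protein -> set of known function labels.
    Returns a dict mapping function label -> weighted fraction of votes.
    """
    # Build an undirected weighted adjacency map.
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w

    votes = defaultdict(float)
    total = 0.0
    for nbr, w in adj[protein].items():
        funcs = annotations.get(nbr)
        if not funcs:
            continue  # unannotated neighbors cast no vote
        total += w
        for f in funcs:
            votes[f] += w

    if total == 0.0:
        return {}  # no annotated neighbors, so nothing to predict
    return {f: s / total for f, s in votes.items()}


# Hypothetical toy network: P1 is the protein to annotate.
edges = [("P1", "P2", 0.75), ("P1", "P3", 0.25), ("P1", "P4", 0.5)]
annotations = {
    "P2": {"cell_cycle"},
    "P3": {"cell_cycle", "dna_repair"},
    "P4": set(),  # unannotated
}
scores = neighbor_vote(edges, annotations, "P1")
print(scores)
```

Here every annotated neighbor of `P1` carries `cell_cycle`, so that label receives the full vote, while `dna_repair` is backed only by the weaker edge to `P3`. An iterative variant would feed confident predictions back into `annotations` and repeat the vote.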
