Identification of essential proteins based on edge features and the fusion of multiple-source biological information.

Author information

School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China.

College of Oceanography and Space Informatics, China University of Petroleum (East China), Qingdao, China.

Publication information

BMC Bioinformatics. 2023 May 17;24(1):203. doi: 10.1186/s12859-023-05315-y.

DOI:10.1186/s12859-023-05315-y

PMID:37198530

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10193741/

Abstract

BACKGROUND

Identifying essential proteins is a major current focus in the analysis of protein-protein interaction (PPI) data. The availability of massive PPI data calls for efficient computational methods for this task. Previous studies have achieved considerable performance. However, because PPI data are noisy and structurally complex, further improving the performance of identification methods remains a challenge.

METHODS

This paper proposes an identification method, named CTF, which identifies essential proteins based on edge features, including h-quasi-cliques and uv-triangle graphs, and on the fusion of multiple-source information. We first design an edge-weight function, named EWCT, for computing the topological scores of proteins based on quasi-cliques and triangle graphs. Then, we generate an edge-weighted PPI network using EWCT and dynamic PPI data. Finally, we compute the essentiality of proteins by fusing the topological scores with three scores derived from biological information.
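The abstract does not give the EWCT formula or the fusion rule, so the following is only a rough sketch of the general idea, not the CTF method itself. It assumes a simple stand-in edge weight (the number of triangles containing an edge), leaves out h-quasi-cliques and dynamic PPI data entirely, and fuses scores with a hypothetical mixing parameter `alpha` and a plain average of the biological scores. All function names are illustrative.

```python
def triangle_edge_weights(edges):
    """Weight each undirected edge (u, v) by the number of common
    neighbours of u and v, i.e. the number of triangles containing it.
    Assumption: a stand-in for EWCT, which the abstract does not define.
    Assumes each undirected edge appears once in `edges`."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {frozenset((u, v)): len(adj[u] & adj[v]) for u, v in edges}


def topological_scores(edges, weights):
    """Score each protein as the sum of the weights of its incident edges."""
    scores = {}
    for u, v in edges:
        w = weights[frozenset((u, v))]
        scores[u] = scores.get(u, 0) + w
        scores[v] = scores.get(v, 0) + w
    return scores


def min_max(scores):
    """Rescale scores to [0, 1]; all-equal inputs map to 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in scores.items()}


def fuse(topo, bio_scores, alpha=0.5):
    """Combine a topological score with several biological scores.
    Assumption: a linear mix with hypothetical weight `alpha`; the
    biological scores are normalised and averaged."""
    topo_n = min_max(topo)
    bio_n = [min_max(b) for b in bio_scores]
    fused = {}
    for protein, t in topo_n.items():
        bio_mean = sum(b.get(protein, 0.0) for b in bio_n) / len(bio_n)
        fused[protein] = alpha * t + (1 - alpha) * bio_mean
    return fused


# Toy network: triangle A-B-C plus a pendant edge C-D.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
weights = triangle_edge_weights(edges)
topo = topological_scores(edges, weights)
# One made-up biological score per protein, purely for illustration.
bio = [{"A": 1.0, "B": 0.0, "C": 0.5, "D": 0.0}]
ranking = sorted(fuse(topo, bio).items(), key=lambda kv: -kv[1])
```

In this toy network the pendant edge C-D lies in no triangle and gets weight 0, so protein D receives no topological support. This shows why triangle-style edge features can down-weight isolated, possibly noisy interactions.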

RESULTS

We evaluated the performance of CTF by comparing it with 16 other methods, such as MON, PeC, TEGS, and LBCC. The experimental results on three Saccharomyces cerevisiae datasets show that CTF outperforms the state-of-the-art methods. Moreover, the results indicate that fusing other biological information helps improve identification accuracy.
