A Novel Method for Identifying Essential Proteins Based on Non-negative Matrix Tri-Factorization.

Author information

Zhang Zhihong, Jiang Meiping, Wu Dongjie, Zhang Wang, Yan Wei, Qu Xilong

Affiliations

College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, China.

School of Information Technology and Management, Hunan University of Finance and Economics, Changsha, China.

Publication information

Front Genet. 2021 Aug 6;12:709660. doi: 10.3389/fgene.2021.709660. eCollection 2021.

Abstract

Identifying essential proteins is important for understanding the basic requirements for sustaining a living organism. In recent years, there has been increasing interest in computational methods that predict essential proteins from protein-protein interaction (PPI) networks, either alone or fused with multiple sources of biological information. However, existing PPI data contain both false negatives and false positives. Fusing multiple sources of biological information can reduce the influence of false data in the PPI network, but it inevitably introduces additional noise. In this article, we propose a novel non-negative matrix tri-factorization (NMTF)-based model (NTMEP) to predict essential proteins. First, a weighted PPI network is built using only the topological features of the network, so as to avoid introducing more noise. To reduce the influence of false data in the PPI network on essential protein identification, the NMTF technique, widely used as a recommendation algorithm, is applied to reconstruct an optimized PPI network that contains more potential protein-protein interactions. Then, we use the PageRank algorithm to compute the final ranking score of each protein, with the initial scores calculated from the subcellular localization and homologous information of the proteins. Extensive experiments on publicly available datasets indicate that the NTMEP model outperforms existing state-of-the-art methods in predicting essential proteins. This investigation demonstrates that introducing NMTF can effectively improve the quality of the PPI network and thereby reduce the negative impact of noise on prediction. The finding also offers a new perspective for other applications based on PPI networks.
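The three stages named in the abstract (topology-only edge weighting, NMTF reconstruction, PageRank with biologically informed initial scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the edge weighting (one plus the number of common neighbors), the standard multiplicative-update rules for the unconstrained Frobenius-norm NMTF objective, the rank `k`, the damping factor `alpha`, and the toy network with its `init` scores are all assumptions chosen for illustration.

```python
import numpy as np

def nmtf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate non-negative X by F @ S @ G.T using multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k)) + eps
    S = rng.random((k, k)) + eps
    G = rng.random((m, k)) + eps
    for _ in range(n_iter):
        # Each factor is rescaled by the ratio of the negative to the positive
        # part of the gradient of ||X - F S G^T||_F^2, which keeps it non-negative.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G

def pagerank(W, init, alpha=0.85, n_iter=100):
    """PageRank on weighted matrix W, restarting to the normalized init scores."""
    col_sums = W.sum(axis=0)
    P = W / np.where(col_sums > 0, col_sums, 1.0)  # column-normalize
    r = init / init.sum()
    s = r.copy()
    for _ in range(n_iter):
        s = alpha * (P @ s) + (1 - alpha) * r
    return s

# Toy PPI network with 6 proteins (illustrative edges, not real data).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Assumed topology-only weighting: 1 + number of common neighbors per edge.
W = A * (1.0 + A @ A)

# Reconstruct a denser network; unobserved pairs now receive non-zero weights.
F, S, G = nmtf(W, k=2)
R = F @ S @ G.T
R = (R + R.T) / 2.0        # symmetrize the reconstruction
np.fill_diagonal(R, 0.0)   # ignore self-interactions

# Placeholder initial scores standing in for values that would be derived
# from subcellular localization and homology information.
init = np.array([0.9, 0.5, 0.8, 0.7, 0.3, 0.2])
scores = pagerank(R, init)
ranking = np.argsort(-scores)  # protein indices, highest score first
```

In this sketch the reconstructed matrix `R` plays the role of the optimized PPI network: pairs with no observed edge can still obtain a positive weight, which is how a factorization-based method can suggest potential interactions. The restart vector is what lets the initial scores influence the final PageRank ranking.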


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f72c/8378176/8d57fa865c8a/fgene-12-709660-g001.jpg
