Identification of protein subcellular localization via integrating evolutionary and physicochemical information into Chou's general PseAAC.

Author information

School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Yaguan Road, Jinnan District, Tianjin, PR China.

School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Yaguan Road, Jinnan District, Tianjin, PR China; School of Computational Science and Engineering, University of South Carolina, Columbia, USA.

Publication information

J Theor Biol. 2019 Feb 7;462:230-239. doi: 10.1016/j.jtbi.2018.11.012. Epub 2018 Nov 16.

Abstract

Identifying where proteins are located in a cell plays an important role in understanding their functions, which in turn supports drug design, therapeutic target discovery, and biological research. However, traditional subcellular localization experiments are time-consuming, laborious, and small in scale. With the development of next-generation sequencing technology, the number of known proteins has grown exponentially, laying the foundation for computational methods that identify protein subcellular localization. Although many methods for predicting the subcellular localization of proteins have been proposed, most are limited to single-location proteins. In this paper, we propose a multi-kernel support vector machine (SVM) to predict the subcellular localization of both multi-location and single-location proteins. First, we extract evolutionary information from the position-specific scoring matrix (PSSM) and the physicochemical properties of proteins, using Chou's general pseudo amino acid composition (PseAAC) and other efficient functions. Then, we use the multi-kernel SVM model to identify multi-label protein subcellular localization. Our method performs well in predicting the subcellular localization of proteins: it achieves average precisions of 0.7065 and 0.6889 on two human datasets, respectively. All results are higher than those achieved by other existing methods. We therefore provide an efficient system that offers a novel perspective for studying protein subcellular localization.
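The abstract does not say how the kernels are built or weighted, so the following is only a minimal sketch of the general multi-kernel idea, not the authors' implementation. It assumes one RBF kernel per feature view (evolutionary and physicochemical) and a fixed convex combination of the two Gram matrices. The function names, the toy feature values, the `gamma` settings, and the weights 0.6 and 0.4 are all hypothetical and chosen for illustration.

```python
import math


def rbf_gram(rows, gamma):
    """Return the RBF Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [[math.exp(-gamma * sq_dist(a, b)) for b in rows] for a in rows]


def combine_kernels(grams, weights):
    """Return the weighted sum of equally sized Gram matrices.

    Weights must be non-negative and sum to 1, so the result is a convex
    combination and remains a valid (positive semidefinite) kernel.
    """
    if len(grams) != len(weights):
        raise ValueError("need exactly one weight per Gram matrix")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for g, w in zip(grams, weights)) for j in range(n)]
        for i in range(n)
    ]


# Toy data (made up): three proteins described by two feature views.
evolutionary = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]
physicochemical = [[1.0, 0.0, 0.5], [0.9, 0.1, 0.4], [0.0, 1.0, 0.5]]

k_evo = rbf_gram(evolutionary, gamma=1.0)          # hypothetical gamma
k_phys = rbf_gram(physicochemical, gamma=0.5)      # hypothetical gamma
k_combined = combine_kernels([k_evo, k_phys], weights=[0.6, 0.4])
```

A combined matrix of this kind could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. One common way to handle multi-label targets, again an assumption rather than something the abstract states, is to train one binary classifier per subcellular location on the same combined kernel.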
