Using a novel AdaBoost algorithm and Chou's Pseudo amino acid composition for predicting protein subcellular localization.

Author information

Lin Jie, Wang Yan

Affiliation

Department of Information Management and Information System, College of Economic and Management, 4800 Cao An Road, Shanghai, China, 216000.

Publication information

Protein Pept Lett. 2011 Dec;18(12):1219-25. doi: 10.2174/092986611797642797.

Abstract

An important characteristic of a protein is its location, or compartment, within the cell, because a protein must be in its proper position to perform its biological functions. Predicting protein subcellular location is therefore an important and challenging task in current molecular and cellular biology. In this paper, a new computational method for identifying protein subcellular location was developed, based on the AdaBoost.ME algorithm and Chou's PseAAC (pseudo amino acid composition). AdaBoost.ME is an improved version of the AdaBoost algorithm that extends the original algorithm directly to multi-class cases, without reducing the task to multiple two-class problems. Some previous studies represented protein samples by the conventional amino acid composition. To take sequence-order effects into account, this study uses Chou's PseAAC to represent protein samples instead. To demonstrate that AdaBoost.ME is a robust and efficient model for predicting protein subcellular locations, the same protein dataset used by Cedano et al. (Journal of Molecular Biology, 1997, 266: 594-600) is adopted. The computed results show that the accuracy achieved by this method is better than that of the methods developed by previous investigators.
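To make the contrast between conventional amino acid composition and PseAAC concrete, the sketch below builds a simplified PseAAC vector: 20 composition terms followed by `lam` sequence-order correlation terms. It is an illustration under stated assumptions, not the feature extraction used in the paper. The abstract does not give the number of correlation tiers or the weight factor, so `lam=3` and `w=0.05` are placeholder values, and a single property (the Kyte-Doolittle hydropathy scale) stands in for the several physicochemical properties that PseAAC formulations typically combine.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values. ASSUMPTION: used here as the only
# physicochemical property; the paper's property set is not stated.
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mu = mean(scale.values())
    sd = pstdev(scale.values())
    return {aa: (value - mu) / sd for aa, value in scale.items()}


H = standardize(KD_HYDROPATHY)


def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional simplified PseAAC vector.

    lam and w are hypothetical defaults chosen for illustration.
    """
    seq = seq.upper()
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    if any(aa not in H for aa in seq):
        raise ValueError("sequence contains a non-standard residue")

    # First 20 components: conventional amino acid composition.
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]

    # Next lam components: k-th tier correlation factors, i.e. the mean
    # squared property difference between residues k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k)]
        thetas.append(sum(diffs) / (n - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    vector = pseaac("MKTAYIAKQRQISFVKSHFSRQ")  # arbitrary example sequence
    print(len(vector), round(sum(vector), 6))
```

Setting `w=0` reduces the first 20 components to the plain composition, which is the representation the abstract attributes to earlier studies. The resulting vectors could then be passed to any multi-class classifier; AdaBoost.ME itself is not reproduced here because the abstract does not describe its update rule.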
