Genetic programming for creating Chou's pseudo amino acid based features for submitochondria localization.

Author information

Nanni Loris, Lumini Alessandra

Affiliation

DEIS, IEIIT-CNR, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy.

Publication information

Amino Acids. 2008 May;34(4):653-60. doi: 10.1007/s00726-007-0018-1. Epub 2008 Jan 4.

Abstract

For a protein localized in the mitochondria, knowing its submitochondrial location is important for understanding its function. In this work, we propose a submitochondria localizer whose feature extraction method is based on Chou's pseudo-amino acid composition. The pseudo-amino acid based features are obtained by combining pseudo-amino acid compositions with hundreds of amino-acid indices and amino-acid substitution matrices; from this huge set of features, a small set of 15 "artificial" features is then created. The feature creation is performed by genetic programming, which combines one or more "original" features by means of mathematical operators. Finally, the set of combined features is used to train a radial basis function support vector machine. This method is named GP-Loc. We also propose a method with very few parameters, named ALL-Loc, in which all the "original" features are used to train a linear support vector machine. The overall prediction accuracy obtained by GP-Loc under jackknife cross-validation is 89%, which outperforms the result reported in the literature on the same dataset (85.2%). The overall prediction accuracy obtained by ALL-Loc is 83.9%.
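As a rough illustration of the kind of "original" feature the abstract refers to, the sketch below computes a type-1 pseudo-amino acid composition vector for a single amino-acid index. It is not the authors' implementation: the function name `pseaac`, the choice of the Kyte-Doolittle hydropathy scale as the index, and the defaults `lam=3` and `w=0.05` are all assumptions made for this example, and the paper combines hundreds of indices and substitution matrices rather than one.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Kyte-Doolittle hydropathy values; chosen only for illustration (assumption).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalize(index):
    """Rescale an amino-acid index to zero mean and unit standard deviation."""
    mu = mean(index.values())
    sigma = pstdev(index.values())
    return {aa: (value - mu) / sigma for aa, value in index.items()}


def pseaac(seq, index=KYTE_DOOLITTLE, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional pseudo-amino acid composition vector.

    The first 20 entries are weighted amino-acid frequencies; the last `lam`
    entries are weighted sequence-order correlation factors for gaps 1..lam.
    """
    seq = seq.upper()
    if any(aa not in AMINO_ACIDS for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    h = normalize(index)
    length = len(seq)

    # Conventional amino-acid composition (occurrence frequencies).
    freqs = [seq.count(aa) / length for aa in AMINO_ACIDS]

    # theta_k: mean squared index difference between residues k positions apart.
    thetas = [
        mean((h[seq[i]] - h[seq[i + k]]) ** 2 for i in range(length - k))
        for k in range(1, lam + 1)
    ]

    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the paper's dataset.
    features = pseaac("MKVLAAGIVGLLLAQWYDERNSTCHFP")
    print(len(features))
```

Repeating the call with a different index dictionary would give another block of features; concatenating such blocks is one plausible way to arrive at a large "original" feature pool of the sort that a feature-construction step could then combine.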

