Prediction of subcellular localization of eukaryotic proteins using position-specific profiles and neural network with weighted inputs.

Author information

Zou Lingyun, Wang Zhengzhi, Huang Jiaomin

Affiliation

College of Mechatronics and Automation, National University of Defense Technology, Changsha 410073, China.

Publication information

J Genet Genomics. 2007 Dec;34(12):1080-7. doi: 10.1016/S1673-8527(07)60123-4.

Abstract

Subcellular location is one of the key biological characteristics of proteins. This article introduces position-specific profiles (PSP) as important characteristics of proteins. To obtain the profiles, the Position-Specific Iterative Basic Local Alignment Search Tool (PSI-BLAST) is used to search for protein sequences in a database. Position-specific scoring matrices are extracted from the profiles as one class of characteristics. Four-part amino acid compositions and 1st- to 7th-order dipeptide compositions are calculated as the other two classes. In total, twelve characteristic vectors are extracted from each protein sequence. Next, the characteristic vectors are weighted by a simple weighting function and input into a BP neural network predictor named PSP-Weighted Neural Network (PSP-WNN). During network training, the Levenberg-Marquardt algorithm, rather than the error back-propagation algorithm, is used to adjust the weight matrices and thresholds. In a jackknife test on the RH2427 dataset, PSP-WNN achieved an overall prediction accuracy of 88.4%, higher than the results obtained on this dataset with the general BP neural network, the Markov model, and the fuzzy k-nearest neighbors algorithm. In addition, the prediction performance of PSP-WNN was evaluated with a five-fold cross-validation test on the PK7579 dataset, and the results were consistently better than those of a previous method based on several support vector machines that use compositions of both amino acids and amino acid pairs. These results indicate that PSP-WNN is a powerful tool for subcellular localization prediction. Finally, the article discusses how different weighting proportions among the three categories of characteristic vectors influence prediction accuracy. A proportion is considered appropriate when it increases the prediction accuracy.
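
The composition-based characteristics and the weighting step can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions, so several points here are assumptions: "four-part" is taken to mean splitting the sequence into four near-equal segments, a dipeptide of order k is taken to mean a pair of residues k positions apart, and the weighting function is reduced to one scalar per characteristic category. The function names and the `w_pssm`, `w_aac`, and `w_dip` parameters are hypothetical, and the PSSM-derived vector is supplied by the caller rather than computed with PSI-BLAST.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
PAIR_INDEX = {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=2))}


def composition(segment):
    """Relative frequency of the 20 standard amino acids in one segment."""
    counts = [0.0] * 20
    for aa in segment:
        if aa in AA_INDEX:  # skip non-standard residue codes
            counts[AA_INDEX[aa]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def four_part_composition(seq):
    """Assumption: four near-equal segments, one 20-dim vector per segment."""
    n = len(seq)
    bounds = [n * i // 4 for i in range(5)]
    return [composition(seq[bounds[i]:bounds[i + 1]]) for i in range(4)]


def dipeptide_composition(seq, order):
    """Assumption: order-k pairs are residues k positions apart (k=1 is adjacent)."""
    counts = [0.0] * 400
    for i in range(len(seq) - order):
        pair = seq[i] + seq[i + order]
        if pair in PAIR_INDEX:
            counts[PAIR_INDEX[pair]] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def weighted_input(seq, pssm_vector, w_pssm=1.0, w_aac=1.0, w_dip=1.0):
    """Scale each category by its (hypothetical) weight and concatenate."""
    features = [w_pssm * x for x in pssm_vector]
    for part in four_part_composition(seq):
        features.extend(w_aac * x for x in part)
    for order in range(1, 8):  # 1st- to 7th-order dipeptide compositions
        features.extend(w_dip * x for x in dipeptide_composition(seq, order))
    return features
```

Under these assumptions, a 20-value PSSM summary yields an input of 20 + 4 × 20 + 7 × 400 = 2900 values. Changing the three weights changes the relative scale of each category at the network input, which is the kind of proportion the abstract says was varied.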
