Prediction of bacterial protein subcellular localization by incorporating various features into Chou's PseAAC and a backward feature selection approach.

Authors

Li Liqi, Yu Sanjiu, Xiao Weidong, Li Yongsheng, Li Maolin, Huang Lan, Zheng Xiaoqi, Zhou Shiwen, Yang Hua

Affiliations

Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China.

Institute of Cardiovascular Diseases of PLA, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China.

Publication information

Biochimie. 2014 Sep;104:100-7. doi: 10.1016/j.biochi.2014.06.001. Epub 2014 Jun 11.

Abstract

Information on the subcellular localization of bacterial proteins is essential for protein function prediction, genome annotation and drug design. Here we propose a novel approach to predicting the subcellular localization of bacterial proteins by fusing features from the position-specific scoring matrix (PSSM), Gene Ontology (GO) and PROFEAT. A backward feature selection approach based on a linear-kernel SVM was then used to rank the integrated feature vectors and extract the optimal features. Finally, an SVM was applied to predict protein subcellular locations from these optimal features. To validate the performance of our method, we ran jackknife cross-validation tests on three low-similarity datasets: M638, Gneg1456 and Gpos523. Overall accuracies of 94.98%, 93.21% and 94.57%, respectively, were achieved on these three datasets, which are 1.8% to 10.9% higher than those of state-of-the-art tools. These comparisons suggest that our method could serve as a useful vehicle for expediting the prediction of bacterial protein subcellular localization.
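The pipeline in the abstract (fuse features, rank them by backward selection with a linear SVM, then score with a jackknife test) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-class toy problem with synthetic data in place of PSSM, GO and PROFEAT features, and a bias-free linear SVM trained by plain subgradient descent in place of a full SVM package. It also assumes that a feature's importance is its squared weight, and that selection is repeated inside every jackknife fold. All function names and parameters are hypothetical.

```python
import random

def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Fit a bias-free linear SVM by subgradient descent on the hinge loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # L2 shrinkage on every step; hinge update only on margin violations.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w

def backward_select(X, y, n_keep):
    """Retrain, then drop the feature with the smallest squared weight,
    until only n_keep features remain. Returns the kept column indices."""
    kept = list(range(len(X[0])))
    while len(kept) > n_keep:
        w = train_linear_svm([[row[j] for j in kept] for row in X], y)
        worst = min(range(len(kept)), key=lambda k: w[k] ** 2)
        kept.pop(worst)
    return kept

def jackknife_accuracy(X, y, n_keep):
    """Leave-one-out test: select and train on n-1 samples, test on the held-out one."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        kept = backward_select(X_tr, y_tr, n_keep)
        w = train_linear_svm([[row[j] for j in kept] for row in X_tr], y_tr)
        score = sum(wj * X[i][j] for wj, j in zip(w, kept))
        correct += (1 if score >= 0 else -1) == y[i]
    return correct / len(X)

# Synthetic stand-in for a fused feature matrix (assumption, not real protein data):
# columns 0-1 carry the class signal, columns 2-5 are low-amplitude noise.
rng = random.Random(0)
X, y = [], []
for n in range(20):
    label = 1 if n % 2 == 0 else -1
    informative = [label * rng.uniform(0.9, 1.1) for _ in range(2)]
    noise = [rng.uniform(-0.1, 0.1) for _ in range(4)]
    X.append(informative + noise)
    y.append(label)

selected = backward_select(X, y, n_keep=2)
accuracy = jackknife_accuracy(X, y, n_keep=2)
print("kept features:", selected)
print("jackknife accuracy:", accuracy)
```

On this deliberately easy toy set the two informative columns survive the elimination and every held-out sample is classified correctly. Real datasets such as M638, Gneg1456 or Gpos523 are multi-class and far higher-dimensional, so a practical version would presumably use an established SVM library and a one-vs-rest or one-vs-one scheme.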
